
WORLD INTELLECTUAL PROPERTY ORGANIZATION 
International Bureau 




PCT

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(51) International Patent Classification 6 : 
G06T 7/00, 15/20 



A1



(11) International Publication Number: WO 99/60525 

(43) International Publication Date: 25 November 1999 (25.11.99)



(21) International Application Number: PCT/GB99/01556

(22) International Filing Date: 17 May 1999 (17.05.99) 



(30) Priority Data:
9810553.9    15 May 1998 (15.05.98)    GB
9910960.5    12 May 1999 (12.05.99)    GB



(71) Applicant (for all designated States except US): TRICORDER
TECHNOLOGY PLC [GB/GB]; Unit 2, The Long Room,
Royal Quay, Coppermill Lock, Summerhouse Lane,
Harefield, Middlesex UB9 6JA (GB).

(72) Inventors; and 

(75) Inventors/Applicants (for US only): WILSON, Jeremy, David, 
Norman [GB/GB]; "Sea View", Cliff Road, Craft Hole, 
Torpoint, Cornwall PL11 3BL (GB). MEIR, Ivan, Daniel 
[GB/GB]; 18 North Park, Gerrards Cross, Buckinghamshire 
SL9 8JW (GB). HOLDBACK, Jonathan, Anthony [GB/GB]; 
2 Montpelier Court, Montpelier Road, Ealing W5 2QN 
(GB). 

(74) Agent: DOBLE, Richard, G., V.; 38 Spring Street, London W2 
1JA (GB). 



(81) Designated States: AE, AL, AM, AT, AU, AZ, BA, BB, BG,
BR, BY, CA, CH, CN, CU, CZ, DE, DK, EE, ES, FI, GB,
GD, GE, GH, GM, HR, HU, ID, IL, IN, IS, JP, KE, KG,
KP, KR, KZ, LC, LK, LR, LS, LT, LU, LV, MD, MG, MK,
MN, MW, MX, NO, NZ, PL, PT, RO, RU, SD, SE, SG, SI,
SK, SL, TJ, TM, TR, TT, UA, UG, US, UZ, VN, YU, ZA,
ZW, ARIPO patent (GH, GM, KE, LS, MW, SD, SL, SZ,
UG, ZW), Eurasian patent (AM, AZ, BY, KG, KZ, MD,
RU, TJ, TM), European patent (AT, BE, CH, CY, DE, DK,
ES, FI, FR, GB, GR, IE, IT, LU, MC, NL, PT, SE), OAPI
patent (BF, BJ, CF, CG, CI, CM, GA, GN, GW, ML, MR,
NE, SN, TD, TG).



Published 

With international search report. 

Before the expiration of the time limit for amending the 
claims and to be republished in the event of the receipt of 
amendments. 



(54) Title: METHOD AND APPARATUS FOR 3D REPRESENTATION 







(57) Abstract 

An arrangement (Figure 1) for generating a 3D representation of an object (3) comprises a camera (CM) freely movable to a further 
viewpoint (CM') and arranged to acquire respective overlapping images at each viewpoint. The orientation of the camera is maintained
roughly constant at the two viewpoints and, after correlating the images (e.g. by Gruen's algorithm), an approximate 3D representation is
generated in a computer (4) by deriving the vector (V, Figure 3) between the viewpoints. In other embodiments (Figures 13 and 15) the 
object is illuminated with structured optical radiation and a 3D representation is derived from the correlation of images of the object and a 
target acquired at a common viewpoint. Methods of tracking the camera (Figures 4 to 6) are also disclosed. 
Method and Apparatus for 3D Representation

The present invention relates to a method and apparatus for deriving a three- 
dimensional representation from two or more two-dimensional images. 

As is well known, in general such a 3D representation can only be generated if the 
features (eg points) of the images are correlated. Respective features of two 
overlapping images are correlated if they are derived from (ie conjugate with) the 
same feature (eg a point) of the object. If the positions and orientations of the
camera at which the overlapping images are acquired are known, then the 3D 
coordinates of the object in the region of overlap can be determined, assuming that 
the camera geometry (in particular its focal length) is known. 

Hu et al "Matching Point Features with ordered Geometric, Rigidity and Disparity
Constraints" IEEE Transactions on Pattern Analysis and Machine Intelligence Vol
16 No 10, 1994 pp 1041-1049 (and references cited therein) discloses suitable
algorithms for correlating features of overlapping images. Such algorithms can be
used in the present invention and will not be discussed further. However it should be
noted that in some circumstances (eg when the object is relatively simple or one of
the known algorithms has failed to correlate sufficient features to enable an accurate 
reconstruction of part of an object to be derived) the features of the overlapping 
images can be correlated by eye and the correlation recorded on screen when using 
an embodiment of the method and apparatus of the present invention. 


Assuming that the correlated points of the images have been derived, in principle the 
information about the position and orientation of the camera required to reconstruct 
the object in 3D can be obtained either by direct measurement or by various 
sophisticated mathematical techniques involving processing all the correlated pairs 
of features. (Stereoscopic camera arrangements in which the cameras are fixed are
ignored for the purposes of the present discussion.)

EP-A-782,100 discloses a method and apparatus in the first category, namely a 
photographic 3D acquisition arrangement in which a camera displacement signal 
and a lens position signal are used to enable a 3D representation of an object to be 
built up from 2D images acquired by a digital camera. Both the position and the
orientation of the camera are monitored. The hardware required to achieve this is 
expected to be expensive. 



A number of research papers have addressed the more difficult question of deriving 
the correct orientation from the correlated features of the images, as follows:

i) E H Thompson "A rational Algebraic Formulation of the problem of Relative 
Orientation" Photogrammetric Record Vol 3 No 14 (1959) pp 152-159 sets out a 
mathematical procedure for aligning two images of the same scene to the correct 
orientation for reconstructing the scene in 3D, involving an iterative solution of five 
simultaneous equations derived from five pairs of correlated points. 

ii) Richard Hartley et al "Stereo from uncalibrated cameras" Proc. IEEE Conf 
Computer Vision and Pattern Recognition (1992) pp 761-763 teaches that two 3x4 
camera matrices define the camera orientations and locations as well as the internal 
camera parameters such as focal length. If the cameras are calibrated (ie the internal 
parameters are known) then the camera matrices can be found from the matched 
points and hence the true 3D locations of all the points can be found. If the cameras 
are not calibrated then known "ground control" points must be used to derive the 
camera matrices and hence the 3D locations of the matched points. 

iii) Richard Hartley "Estimation of Relative Camera Positions for Uncalibrated 
Cameras" Computer Vision-ECCV'92, LNCS-Series, Vol 588, 1992 pp 579-587 is a 
development of the above Hartley paper and shows that the focal lengths as well as
the positions and orientations of the cameras can be found from the matched points
if all the other internal camera parameters are known.

iv) Hartley et al "Computing Matched-epipolar projections" Computer Vision and 
Pattern Recognition 1993, pp 549-555 is a development of the above which
introduces the epipolar transformation matrix to transform the images to images that
would be acquired by cameras placed side-by-side with their optical axes parallel. 
The remaining points can then be correlated more easily. Also the cameras do not 
need to be calibrated. 






A disadvantage of the above mathematical methods is the intensive computation 
required to process all the correlated points. Much of this computation involves the 
determination of the camera positions and orientations from multiple points and is 
effectively wasted because the positions and orientations are grossly
over-determined by the large number of processed points.

It has been shown that in general all the information on the viewpoints (ie position 
and orientation) of the camera(s) can be found from eight pairs of correlated points. 
However there are special cases in which this is not possible. One extreme example 
is a situation in which all eight points on the object corresponding to the pairs of 
correlated points in the images are collinear.
Importantly, another situation in which the camera positions cannot be determined is
when the camera orientations (in the reference frame of the object) are identical. As
will be apparent from Figure 3 (discussed below) the camera separation in such a
situation is quite indeterminate, no matter how many pairs of points are correlated.
In such a situation the size of the object cannot be determined from the two images, 
even if the focal length and other camera parameters are known, and the 3D 
reconstruction will need to be multiplied by a scaling factor. 
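
By way of illustration only, the indeterminacy of scale can be made explicit with a short worked derivation. The sketch below rests on simplifying assumptions that are not stated in the text: an idealised pinhole camera of focal length f, identical orientation at both viewpoints, and a displacement of length b perpendicular to the optical axis. The symbols f, b, x_1, x_2, y and d are introduced here purely for the example; (X, Y, Z) is an object point in the frame of the first viewpoint.

```latex
\begin{align*}
x_1 &= \frac{fX}{Z}, \qquad x_2 = \frac{f(X-b)}{Z}, \qquad y = \frac{fY}{Z},\\
d &= x_1 - x_2 = \frac{fb}{Z},\\
Z &= \frac{fb}{d}, \qquad X = \frac{x_1\,b}{d}, \qquad Y = \frac{y\,b}{d}.
\end{align*}
```

Under these assumptions every reconstructed coordinate is proportional to b, so the correlated images fix the shape of the object but not its size until a value for the separation is supplied.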

Bearing in mind that the resolution of many digital cameras is severely limited (eg to
640 x 480 pixels) it will be apparent that if the camera orientations are only slightly
different then there will be considerable uncertainty in the camera positions and
hence in the 3D reconstruction of the object. On the other hand if the camera
orientations are deliberately chosen to be very different (ie converging sharply on
the object) then points on the object in overhanging regions which are in the field of
view when one image is acquired will not be in the field of view when the other 
image is acquired, preventing 3D reconstruction of regions of overhang. 

An object of the present invention is to overcome or alleviate at least some 
disadvantages of the known methods and apparatus, particularly when the resolution
of the images from which the 3D reconstruction is generated is limited.

Accordingly, in one aspect the invention provides a method of deriving a 3D 
representation of at least part of an object from correlated overlapping 2D images of 
the object acquired from different spaced apart viewpoints relative to the object, the 

separation between the viewpoints not being precisely known, the method
comprising the step of digitally processing the 2D images to form a 3D
representation which extends in a simulated 3D space in dependence upon both the
mutual offset between correspondences of the respective 2D images and a scaling
variable, the scaling variable being representative of the separation between the
viewpoints at which the 2D images were acquired.

The invention also provides image processing apparatus for deriving a 3D
representation of at least part of an object from correlated overlapping 2D images of
the object acquired from different spaced apart viewpoints relative to the object, the
apparatus comprising image processing means which is arranged to digitally process
the 2D images to form a 3D representation which extends in a simulated 3D space in 
dependence upon both the mutual offset between correspondences of the respective 
2D images and a scaling variable, the scaling variable being representative of the 
separation between the viewpoints at which the 2D images were acquired. 




In use the scaling variable is preferably entered by a user. 

For example if the actual camera separation is a and the partial 3D reconstruction is
generated by virtual projectors with the same optical parameters as the camera(s)
and having a separation of a' in simulated 3D space then the scaling factor could be
a/a' to magnify the partial 3D reconstruction by a factor of a/a' and thereby generate
a partial 3D reconstruction having the same size as the object. Such a partial 3D
reconstruction can be fitted to other life-size partial 3D reconstructions
generated similarly from other pairs of images. In these embodiments, any value of
scaling factor which will enable the partial 3D reconstructions to be fitted together
will be satisfactory, and can for example be applied by the user during a process of
fitting together the partial 3D reconstructions on-screen.
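
By way of illustration only, the scaling step described above might be sketched as follows. The function name `scale_reconstruction`, its parameter names and the numerical values are assumptions made for the example, as is the choice to scale about the origin of the simulated 3D space.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]


def scale_reconstruction(points: List[Point3D],
                         actual_separation: float,
                         simulated_separation: float) -> List[Point3D]:
    """Magnify a partial 3D reconstruction by the factor a/a'.

    `actual_separation` plays the role of a and `simulated_separation`
    the role of a'. Scaling about the origin of the simulated space is
    an assumption of this sketch.
    """
    if simulated_separation <= 0:
        raise ValueError("simulated separation a' must be positive")
    factor = actual_separation / simulated_separation
    return [(factor * x, factor * y, factor * z) for x, y, z in points]


# Illustrative values only: virtual projectors 1.0 unit apart in simulated
# space, camera viewpoints actually 0.3 m apart.
partial = [(0.0, 0.0, 2.0), (0.5, 0.0, 2.0), (0.0, 0.25, 4.0)]
life_size = scale_reconstruction(partial, actual_separation=0.3,
                                 simulated_separation=1.0)
```

Where the actual separation is not known, the same function could be driven by a user-entered factor until two partial reconstructions fit one another.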

In these embodiments the camera orientation (in the reference frame of the object) at 
each of the two viewpoints is preferably the same or nearly the same (eg within ± 10
degrees) and the or each partial 3D reconstruction is preferably generated by virtual 
projectors having the same orientation as the camera(s) and having optical centres on 
the line joining the optical centres of the camera(s). 

Preferably the method further comprises the step of acquiring the overlapping 2D 
images from a camera which is moved relative to the object between the different 
viewpoints, the net movement of the camera between the viewpoints not being fully 
constrained. 

For example the camera could be mounted on a fixed slide so as to move 
transversely to its optical axis (so that its orientation and movement along two axes
are constrained but its movement along the third axis is not) or it could be mounted on
a tripod (so that its movement in the vertical direction and rotation about a horizontal 
axis are constrained). 

However in a preferred embodiment the camera is hand-held.

Optionally, the camera orientation can be measured with an inertial sensor, eg a
vibratory gyroscope, and appropriate filtering and integrating circuitry as disclosed in
our UK patent GB 2,292,605 B which is hereby incorporated by reference. A Kalman
filter is presently preferred for filtering the inertial sensor output signals.

Preferably the orientation of the camera is varied after acquiring a first image from 
one viewpoint and before acquiring a second image from the other viewpoint so as 
to maintain its orientation relative to the reference frame of the object when 
acquiring the second image. 
One purely optical way of ensuring that the orientation of the camera is unchanged
relative to the orientation at the first viewpoint is to vary the orientation until the
projections in the image plane of the camera of the correlated points of the first
image and the corresponding points of the second image (which is preferably
instantaneously displayed on screen) converge at a common point. In the special
case in which the distance to the object from the image plane along the camera's
optical axis is unchanged between the two viewpoints (in which case the above 
common point would be at infinity) then the orientation at the second viewpoint can 
be adjusted until the above projections are parallel. 
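
By way of illustration only, the convergence test described above might be sketched as a least-squares intersection of the lines joining each correlated pair of image points. The function name `common_point`, the tolerance `eps` and the use of a least-squares fit are assumptions made for the example and are not taken from the text.

```python
import math
from typing import List, Optional, Tuple

Point2D = Tuple[float, float]


def common_point(first: List[Point2D], second: List[Point2D],
                 eps: float = 1e-9) -> Optional[Point2D]:
    """Least-squares intersection of the lines joining correlated pairs.

    Each line passes through a point of the first image and the
    corresponding point of the second image. Returns None when the lines
    are parallel (to within eps), ie the common point lies at infinity.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (x1, y1), (x2, y2) in zip(first, second):
        dx, dy = x2 - x1, y2 - y1
        length = math.hypot(dx, dy)
        if length < eps:
            continue  # no displacement, so the pair defines no line
        nx, ny = -dy / length, dx / length  # unit normal of the line
        c = nx * x1 + ny * y1               # line equation: n . p = c
        a11 += nx * nx
        a12 += nx * ny
        a22 += ny * ny
        b1 += nx * c
        b2 += ny * c
    det = a11 * a22 - a12 * a12
    if det < eps:
        return None
    # Solve the 2 x 2 normal equations (sum n n^T) p = sum n c.
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)
```

In such a sketch the orientation would be adjusted until the joining lines pass close to the returned point, or, in the special case of unchanged distance to the object, until the function reports that the lines are parallel.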

The present invention also relates to a method and apparatus for deriving a 
representation of the three-dimensional (3D) shape of an object from an image 
(referred to herein as an object image) of the projection of structured optical 
radiation onto the object surface. The term "structured optical radiation" is a 
generalisation of the term "structured light" and is intended to cover not only 
structured light but also structured electromagnetic radiation of other wavelengths 
which obeys the laws of optics. 

In principle the 3D shape of part of an object surface can be obtained by projecting 
structured light, eg a grid pattern, onto a surface of the object, acquiring an image of
the illuminated region of the object surface and identifying the elements of the 
structured light (eg the crossed lines of the grid pattern) which correspond to the 
respective features (eg crossed lines) of the image, assuming that the spatial 
distribution of the structured light is known. 

One such arrangement is shown by Hu & Stockman in "3-D Surface Solution Using 
Structured Light and Constraint Propagation" in IEEE Trans PAMI Vol 11 No 4 pp
390-402, 1989, who discuss the advantages of the technique over stereoscopic
imaging techniques. 

In a second aspect the invention provides a method of deriving a 3D representation 
of at least part of an object from a 2D image thereof, comprising the steps of 
illuminating the object with structured projected optical radiation, acquiring a 2D
image of the illuminated object, correlating the 2D image with rays of the structured
optical radiation, and digitally processing the 2D image to form a 3D representation
which extends in a simulated 3D space in dependence upon both the correlation and
a scaling variable, the scaling variable being representative of the separation between
a location from which the structured optical radiation is projected and the viewpoint
at which the 2D image is acquired.

In this aspect the invention also provides image processing apparatus for deriving a 
3D representation of at least part of an object from a 2D image of the illuminated 
object, the object being illuminated with structured optical radiation projected from a
location spaced apart from the viewpoint at which the 2D image is acquired, the 2D
image being correlated with the structured radiation, the apparatus comprising digital
processing means arranged to form a 3D representation which extends in a simulated
3D space in dependence upon both the correlation and a scaling variable, the scaling
variable being representative of the separation between the location from which the
structured optical radiation is projected and the viewpoint at which the 2D image is 
acquired. 

This aspect is related to the first aspect in that the separation of perspective centres 
of the cameras does not need to be known in the apparatus and method of the first 
aspect and the separation of the perspective centres of the projector and camera does
not need to be known in the apparatus and method of the second aspect. There is a 
clear analogy between the camera-camera arrangement employed in the first aspect 
and the camera-projector arrangement employed in the second aspect. In each case, a 
3D representation can be derived from the intersection of two projections, in the one 
case representing the respective pencils of camera rays and in the other case 
representing the respective pencils of projector rays and camera rays. 

The calibration image can for example be of the projection of the structured optical 
radiation onto a calibration surface or can for example be a further object image 
obtained after moving the object relative to the camera used to acquire the initial 
object image and the projector means used to project the structured optical radiation. 

Preferably the first and second projections are from a baseline linking an origin of 
the structured optical radiation and a perspective centre associated with the image 
(eg the optical centre of the camera lens used to acquire the image), the 
reconstruction processing means being arranged to derive said baseline from two or 
more pairs of correlated features. This feature is illustrated in Figures 3 and 13 
discussed in detail below. 

In one embodiment the image processing means is arranged to generate
correspondences between two or more calibration images and to determine the 
spacing between origins of the first and second projections in dependence upon both 
the correspondences of the two or more calibration images and input or stored metric 
information associated with the calibration images. This feature is illustrated in
Figure 15, discussed in detail below.

In another embodiment the reconstruction processing means is arranged to vary the 
spacing between the origins of the first and second projections in dependence upon a 
scaling variable enterable by a user. In this embodiment a further calibration image
is not required. Preferably the apparatus includes means for displaying the 3D 
representation with a relative scaling dependent upon the value of the scaling 
variable. 

In a further aspect the invention provides a method of generating a 3D representation 
of an object from an object image of the projection of structured optical radiation 
onto the object surface and from at least one calibration image of the projection of 
the structured optical radiation onto a surface displaced from the object surface, the 
method comprising the steps of:

i) correlating at least one calibration image with the object image and optionally with
a further calibration image;

ii) simulating a first projection of the object image and a second projection of the
structured optical radiation, and

iii) deriving said 3D representation from the mutual intersections of the first and
second projections.

In a related aspect the invention provides image processing apparatus for generating 
a 3D representation of at least part of an object from an object image of the 
projection of structured optical radiation onto the object surface and from at least
one calibration image of the projection of the structured optical radiation onto a
surface displaced from the object surface, the apparatus comprising image
processing means arranged to generate correspondences between at least one
calibration image and the object image and optionally a further calibration image,
and reconstruction processing means arranged to simulate a first projection of the
object image and a second projection linking respective correspondences of at least
two of the correlated images and to derive said 3D representation from the mutual 
intersections of the first and second projections. 
Preferred features of the invention are defined in the dependent claims.

In another aspect the invention provides image processing apparatus for deriving a 
3D representation of at least part of an object from a 2D image thereof, the object 
being illuminated with structured optical radiation projected from a location spaced 
apart from the viewpoint at which the 2D image is acquired, the 2D image being 
correlated with rays of the structured radiation, the apparatus comprising digital
processing means arranged to form a 3D reconstruction which extends in a simulated
3D space in dependence upon both the correlation and a scaling variable, the scaling
variable being representative of the separation between the location from which the 
structured optical radiation is projected and the viewpoint at which the 2D image is 
acquired. 

This aspect of the invention is illustrated in Figure 13. Following a simple
calibration procedure requiring no knowledge of the position of the camera or the
projector relative to the object it enables a 3D representation to be generated. This
can optionally be displayed and scaled or it can be distorted eg for special effects in
graphics and animation.

Preferably the apparatus is arranged to derive a further 3D representation from a
further 2D image acquired from a different viewpoint relative to the object, the
combining means being arranged to combine the first-mentioned 3D representation
and the further 3D representation by manipulations in a simulated 3D space
involving one or more of rotation and translation, the apparatus further comprising
scaling means arranged to reduce or eliminate any remaining discrepancies between
the 3D reconstructions by scaling one 3D reconstruction relative to the other along at
least one axis.

Preferably the apparatus is arranged to display both 3D representations
simultaneously and to manipulate them in simulated 3D space in response to
commands entered by a user.

In one embodiment the apparatus is arranged to perform the manipulations of the 3D
reconstructions under the control of a computer pointing device.

Preferably the apparatus includes means for combining two or more 3D
representations and means for adjusting the relative scaling of the representations to
enable them to fit each other.

In other embodiments the variable will correct for one or more distortions of the 
partial 3D reconstruction either laterally or in the depth direction (curvature of field)
which, as shown below in connection with Figures 8 to 11, can arise from incorrect
positioning of the virtual projectors eg a misalignment relative to the camera 
viewpoints. 

In certain embodiments the partial 3D reconstruction will be distorted by deliberately
misaligning one or both of the virtual projectors relative to the camera viewpoints or
camera and projector viewpoints. 

Such a feature is useful in the fields of design, graphics and animation. 

Some difference in orientation can be tolerated and the resulting distortion in the 3D
reconstruction subsequently corrected, as will be shown below. 

The smaller the difference in orientation, the smaller the distortion in the 3D 
reconstruction. Ideally the orientation of the camera is maintained unchanged 
between the two viewpoints.

Preferably the angle subtended by a pair of correlated features at the corresponding 
feature of the object is 90 degrees ± 30 degrees (more preferably ±10 degrees). 
Ideally the subtended angle is exactly 90 degrees. This feature enables any distortion 
resulting from a slight change in orientation to be corrected more accurately.
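
By way of illustration only, the preferred range for the subtended angle might be checked as sketched below. The sketch assumes that the angle is measured at the object feature between the two rays running to the respective viewpoints, and takes the optical centres as stand-ins for the correlated image features since each lies on the same ray; the function names and the default tolerance argument are assumptions made for the example.

```python
import math
from typing import Sequence


def subtended_angle_deg(feature: Sequence[float],
                        centre_a: Sequence[float],
                        centre_b: Sequence[float]) -> float:
    """Angle in degrees, at `feature`, between the rays to the two centres."""
    va = [a - f for a, f in zip(centre_a, feature)]
    vb = [b - f for b, f in zip(centre_b, feature)]
    na = math.sqrt(sum(x * x for x in va))
    nb = math.sqrt(sum(x * x for x in vb))
    if na == 0.0 or nb == 0.0:
        raise ValueError("a viewpoint coincides with the object feature")
    cos_angle = sum(x * y for x, y in zip(va, vb)) / (na * nb)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
    return math.degrees(math.acos(cos_angle))


def within_preferred_range(angle_deg: float, tolerance_deg: float = 30.0) -> bool:
    """True if the angle lies within 90 degrees +/- tolerance_deg."""
    return abs(angle_deg - 90.0) <= tolerance_deg
```

Calling `within_preferred_range(angle, 10.0)` would apply the narrower, more preferred tolerance.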

Following the partial reconstruction by the above method, complementary 3D 
reconstructions of different parts of the object obtained similarly from further sets of 
overlapping images can be fitted together. 


To the extent that the reconstruction is distorted then simple compensations for 
distortion parallel to the image plane and for curvature of field can be applied in 
order to enable the 3D reconstruction to be fitted. 






Preferably a view of the reconstruction in the simulated 3D space is displayed eg on 
a screen and the variable is varied by the user in response to the view displayed on 
screen. 
Preferably the image processing means is arranged to generate said correspondences
of said images by comparing local radiometric distributions of said images. 

In a related aspect the invention provides a method of deriving a 3D representation 
of at least part of an object from a 2D image thereof, comprising the steps of
illuminating the object with structured projected optical radiation, acquiring a 2D
image of the illuminated object, correlating the 2D image with rays of the structured
optical radiation, and digitally processing the 2D image to form a 3D reconstruction
which extends in a simulated 3D space in dependence upon both the correlation and
a scaling variable, the scaling variable being representative of the separation
between a location from which the structured optical radiation is projected and the
viewpoint at which the 2D image is acquired. 

Suitable algorithms for correlating (generating correspondences between) 
overlapping images are already known - eg Gruen's algorithm (see Gruen, A W
"Adaptive least squares correlation: a powerful image matching technique" S Afr J of
Photogrammetry, Remote Sensing and Cartography Vol 14 No 3 (1985) and Gruen,
A W and Baltsavias, E P "High precision image matching for digital terrain model
generation" Int Arch Photogrammetry Vol 25 No 3 (1986) p254) and particularly the
"region-growing" modification thereto which is described in Otto and Chau
"Region-growing algorithm for matching terrain images" Image and Vision
Computing Vol 7 No 2 May 1989 p83, all of which are incorporated herein by
reference. 

Essentially, Gruen's algorithm is an adaptive least squares correlation algorithm in
which two image patches of typically 15 x 15 to 30 x 30 pixels are correlated (ie
selected from larger left and right images in such a manner as to give the most
consistent match between patches) by allowing an affine geometric distortion
between coordinates in the images (ie stretching or compression in which originally
parallel lines remain parallel in the transformation) and allowing an additive
radiometric distortion between the grey levels of the pixels in the image patches,
generating an over-constrained set of linear equations representing the discrepancies
between the correlated pixels and finding a least squares solution which minimises
the discrepancies.



The Gruen algorithm is essentially an iterative algorithm and requires a reasonable 
approximation for the correlation to be fed in before it will converge to the correct
solution. The Otto and Chau region-growing algorithm begins with an approximate 
match between a point in one image and a point in the other, utilises Gruen's 
algorithm to produce a more accurate match and to generate the geometric and 
radiometric distortion parameters, and uses the distortion parameters to predict 
approximate matches for points in the neighbourhood of the initial
matching point. The neighbouring points are selected by choosing the adjacent 
points on a grid having a grid spacing of eg 5 or 10 pixels in order to avoid running 
Gruen's algorithm for every pixel. 
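
By way of illustration only, the region-growing procedure described above might be sketched as follows. The sketch is not the published Otto and Chau implementation: the function `grow_region`, the caller-supplied `refine` callable (a hypothetical stand-in for a Gruen-style least squares refinement), the use of only the 2 x 2 linear part of the affine distortion and the simple first-in first-out ordering of candidates are all assumptions made for the example.

```python
from collections import deque
from typing import Callable, Dict, Optional, Tuple

GridPoint = Tuple[int, int]
ImagePoint = Tuple[float, float]
# Linear part (a, b, c, d) of the local affine distortion:
# du' = a*du + b*dv,  dv' = c*du + d*dv
Linear = Tuple[float, float, float, float]
Refine = Callable[[GridPoint, ImagePoint], Optional[Tuple[ImagePoint, Linear]]]


def grow_region(seed: GridPoint, seed_guess: ImagePoint, refine: Refine,
                width: int, height: int,
                spacing: int = 5) -> Dict[GridPoint, ImagePoint]:
    """Grow matches outwards from a seed over a grid of the given spacing.

    `refine(point, guess)` is assumed to return the refined match and the
    local distortion parameters, or None if it fails to converge.
    """
    matches: Dict[GridPoint, ImagePoint] = {}
    queue = deque([(seed, seed_guess)])
    while queue:
        point, guess = queue.popleft()
        x, y = point
        if point in matches or not (0 <= x < width and 0 <= y < height):
            continue
        result = refine(point, guess)
        if result is None:
            continue  # may be retried later from another neighbour
        (mx, my), (a, b, c, d) = result
        matches[point] = (mx, my)
        # Predict approximate matches for the four adjacent grid points
        # from the distortion parameters of the match just found.
        for dx, dy in ((spacing, 0), (-spacing, 0), (0, spacing), (0, -spacing)):
            neighbour = (x + dx, y + dy)
            if neighbour not in matches:
                predicted = (mx + a * dx + b * dy, my + c * dx + d * dy)
                queue.append((neighbour, predicted))
    return matches
```

Because new candidates are queued only after a successful match, the loop terminates once no further grid points can be matched.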

Hu et al "Matching Point Features with ordered Geometric, Rigidity and Disparity 
Constraints" IEEE Transactions on Pattern Analysis and Machine Intelligence Vol 
16 No 10, 1994 pp 1041-1049 (and references cited therein) discloses further
methods for correlating features of overlapping images. 

Since the above algorithms were developed for generating correspondences between 
images having poorly defined features (eg aerial photographs) whereas the 
projection of structured light onto an object surface will generate distinct local 
radiometric distributions, the problem of correlation is less critical in the context of 
the present invention. Accordingly the precise correlation algorithm is not critical. 
However we have found a number of improvements to the Gruen algorithm, as 
follows: 

i) the additive radiometric shift employed in the algorithm can be dispensed with; 

ii) if during successive iterations, a candidate matched point moves by more than a 
certain amount (eg 3 pixels) per iteration then it is not a valid matched point and 
should be rejected; 

iii) during the growing of a matched region it is useful to check for sufficient 
contrast on at least three of the four sides of the region in order to ensure that there is
sufficient data for a stable convergence - in order to facilitate this it is desirable to 
make the algorithm configurable to enable the parameters (eg required contrast) to 
be optimised for different environments, and 

iv) in order to quantify the validity of the correspondences between images it has 
been found useful to re-derive the original grid point in the starting image by
applying the algorithm to the matched point in the other image (ie reversing the 
stereo matching process) and measuring the distance between the original grid point 
and the new grid point found in the starting image from the reverse stereo matching. 
The smaller the distance the better the correspondence.
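
By way of illustration only, improvements ii) and iv) above might be sketched as two small checks. The function names, the representation of an iteration history as a list of points and the `forward` and `backward` matcher callables are hypothetical and introduced purely for the example; the 3 pixel default is the example figure given in ii).

```python
import math
from typing import Callable, Sequence, Tuple

Point = Tuple[float, float]


def moves_too_far(track: Sequence[Point], max_step: float = 3.0) -> bool:
    """Improvement ii): True if the candidate matched point moved by more
    than max_step pixels between any two successive iterations."""
    return any(math.hypot(q[0] - p[0], q[1] - p[1]) > max_step
               for p, q in zip(track, track[1:]))


def round_trip_error(grid_point: Point,
                     forward: Callable[[Point], Point],
                     backward: Callable[[Point], Point]) -> float:
    """Improvement iv): distance between the original grid point and the
    point re-derived by matching back from the other image."""
    back = backward(forward(grid_point))
    return math.hypot(back[0] - grid_point[0], back[1] - grid_point[1])
```

A candidate for which `moves_too_far` returns True would be rejected, and the value returned by `round_trip_error` could serve as a figure of merit, smaller values indicating better correspondences.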

It is not necessary (and indeed in many cases it will be computationally inefficient) 
to correlate all possible features prior to determining the viewpoints of the camera(s) 
relative to the object. It will usually be simpler to derive the remaining correlations 
between features of the respective images once the viewpoints have been
determined from a small number (eg eight) of pairs of correlated features, by searching
for further correlated features along epipolar lines determined from the viewpoint
determination.
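
By way of illustration only, the search along epipolar lines might be sketched as follows. The sketch assumes that the viewpoint determination has been expressed as a 3 x 3 fundamental matrix F with the convention q^T F p = 0 for a point p of the first image and its match q in the second; that representation, the function names and the pixel tolerance are assumptions made for the example.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Line = Tuple[float, float, float]  # (a, b, c) with a*x + b*y + c = 0


def epipolar_line(F: Sequence[Sequence[float]], point: Point) -> Line:
    """Epipolar line in the second image for a point of the first image."""
    x, y = point
    a, b, c = (row[0] * x + row[1] * y + row[2] for row in F)
    return (a, b, c)


def distance_to_line(line: Line, point: Point) -> float:
    """Perpendicular distance of a point from the line a*x + b*y + c = 0."""
    a, b, c = line
    norm = math.hypot(a, b)
    if norm == 0.0:
        raise ValueError("degenerate epipolar line")
    return abs(a * point[0] + b * point[1] + c) / norm


def candidates_near_line(F: Sequence[Sequence[float]], point: Point,
                         candidates: List[Point],
                         tolerance: float = 1.0) -> List[Point]:
    """Keep only candidate features lying within `tolerance` pixels of the
    epipolar line of `point`, so that matching need only examine those."""
    line = epipolar_line(F, point)
    return [q for q in candidates if distance_to_line(line, q) <= tolerance]
```

Restricting the candidates in this way reduces the search for each remaining correlation from the whole image to a narrow band about a single line.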

Once the correlations and positions and orientations of the cameras are known, the 
3D configuration of the object is obtainable by projecting each image from a (eg
virtual) projector having the same focal length and viewpoint (position and
orientation) as the camera which acquired that image. The principal rays from 
corresponding features of the respective images will intersect in (virtual) 3D space at 
the location of the object feature. 
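
The intersection of two projected rays can be computed as sketched below. This is a minimal illustration under stated assumptions: each ray is given by a projector origin and a direction vector in a common coordinate system, and because measured rays rarely meet exactly the function returns the midpoint of the shortest segment joining them (which is the true intersection when the rays do meet). The function and helper names are hypothetical.

```python
def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def intersect_rays(o1, d1, o2, d2):
    """Return the 3D point nearest to both lines o1 + s*d1 and o2 + t*d2."""
    w = [p - q for p, q in zip(o1, o2)]
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; no unique intersection")
    s = (b * e - c * d) / denom  # parameter along the first ray
    t = (a * e - b * d) / denom  # parameter along the second ray
    p1 = [o1[i] + s * d1[i] for i in range(3)]
    p2 = [o2[i] + t * d2[i] for i in range(3)]
    return tuple((p1[i] + p2[i]) / 2.0 for i in range(3))


if __name__ == "__main__":
    feature = (1.0, 2.0, 10.0)
    origin_a, origin_b = (0.0, 0.0, 0.0), (2.0, 0.0, 0.0)
    ray_a = tuple(feature[i] - origin_a[i] for i in range(3))
    ray_b = tuple(feature[i] - origin_b[i] for i in range(3))
    print(intersect_rays(origin_a, ray_a, origin_b, ray_b))
```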

Accordingly, in another aspect the invention provides a method of generating a 3D 
reconstruction of an object comprising the steps of projecting images of the object 
acquired by mutually aligned cameras into simulated 3D space from aligned virtual 
projectors, the separation of the virtual projectors being variable by the user. 

In this aspect the invention also provides apparatus for generating a 3D 
reconstruction of an object comprising two aligned virtual projector means arranged 
to project images of the object acquired by mutually aligned cameras into simulated 
3D space, the separation of the virtual projectors being variable by the user. 

Preferably the difference in alignment of the virtual projectors (ie the angle between 
them) is less than 45 degrees, more preferably less than 30 degrees, and is most 
preferably less than 20 degrees, eg less than 10 degrees. Desirably this angle is less 
than 5 degrees, eg less than one degree. This feature enables overhanging features of 
the object to be captured from both camera viewpoints and also facilitates the 
determination of the line connecting the optical centres of the camera(s) at the 
different viewpoints as well as the correlation of features between the images. 

Preferably the origin of each projection is located on a line in simulated 3D space 
connecting the corresponding optical centres of the camera at the two viewpoints. As 
explained below with reference to Figure 3, this will result in a scaled partial 3D 
reconstruction of the object. 

In one embodiment a distortion parameter is entered by the user and applied to the 
3D reconstruction. For example, after projecting the overlapping 2D images from 
their nominal viewpoints such that the projections intersect to form an initial 3D 
reconstruction in simulated 3D space, the initial 3D reconstruction can be rotated 
whilst constraining the features of the initial 3D reconstruction which are generated 
from the intersecting projections of correlated features of the projected images to lie 
on the projections of those features from one of the 2D images, thereby forming a 
further 3D reconstruction. As is illustrated in Figure 9 (discussed below) this can be 
used to generate a reconstruction which is parallel to, and therefore a scaled replica 
of, the actual object surface. The above-mentioned rotation and constraint correct for 
the lateral distortion caused by the lack of parallelism of the initial 3D reconstruction 
and the 3D reconstruction generated from correctly aligned virtual projectors. 

However as noted above, in many applications it will be desirable to distort the 3D 
reconstruction relative to the original object in order to achieve a desired artistic 
effect. 

It should be noted that the aspect of the invention concerning the derivation of a 3D 
reconstruction is applicable to previously acquired images in conjunction with their 
associated pairs of correlated points, however acquired. 

Preferably the viewpoints are calculated from at least two (desirably at least three) 
pairs of correlated features and the 3D reconstruction of the object is generated in 
dependence upon said calculation of the viewpoints, the calculation of the 
viewpoints being performed on fewer than all derivable pairs of correlated features. 

This preferred feature of the invention is illustrated in Figure 3 (discussed below) 
which shows the derivation of the line connecting the viewpoints from the pencil of 
projections from three pairs of correlated features. Since the third projection from 
the third pair of correlated features P3 and P3' intersects the point VP already 
defined by the other two projections' intersection (and thus merely confirms the 
unchanged orientation between the viewpoints) it is not strictly necessary for finding 
the straight line joining the viewpoints. Hence this line can be found from just two pairs 
of correlated features if there is no change in camera orientation. However greater 
accuracy will be obtained if more than two pairs of correlated features are processed. 

Preferably said calculation is performed on fewer than one thousand pairs of 
correlated features, more preferably fewer than one hundred pairs of correlated 
features, desirably fewer than fifty pairs of correlated features, eg eight or fewer 
pairs. For example the calculation can be performed on four, three or two pairs of 
correlated points. 

Particularly when the viewpoints are parallel or nearly parallel (eg within 10 degrees 
of each other) the above preferred features result in a greater or lesser degree of 
economy of processing. 

Further preferred features of the invention are defined in the dependent claims. 

Preferred embodiments of the invention are described below by way of example only 
with reference to Figures 1 to 20 of the accompanying drawings, wherein: 

Figure 1 is a diagrammatic view of one apparatus in accordance with the two-camera 
aspects of the invention; 

Figure 2 is a flow diagram of one method in accordance with the two-camera aspects 
of the invention; 

Figure 3 is a ray diagram showing the relationship between the object, camera and 
projector viewpoints, virtual projector viewpoints and partial 3D reconstruction in 
one embodiment of the invention; 

Figure 4 is a ray diagram in 3D showing a derivation in accordance with one aspect 
of the invention of the direction of movement of the camera from the acquired 
images in the method of Figure 2 and apparatus of Figure 1; 

Figure 5 is a ray diagram in 3D showing a derivation of the direction of movement of 
the camera from the acquired images in the method of Figure 2 and apparatus of 
Figure 1 in the special case in which the camera does not move relative to the object 
in the Z direction; 

Figure 6 is a diagram showing the movement of the image of the object in the image 
plane I of Figure 5; 

Figure 7 is a flow diagram summarising the image processing steps utilised in the 
method of Figure 2 and the apparatus of Figure 1; 

Figure 8 is a 2D ray diagram illustrating the curvature of field resulting from 
misalignment of one virtual projector relative to the other in the apparatus of Figure 
1 and method of Figure 2; 

Figure 9 is a 2D ray diagram illustrating correction of distortion of the partial 3D 
reconstruction in an embodiment of the invention; 

Figure 10A is a 2D ray diagram illustrating the curvature of field resulting from a 
misalignment of a virtual projector by 5 degrees; 

Figure 10B is a ray diagram illustrating the curvature of field resulting from a 
misalignment of a virtual projector by 10 degrees; 

Figure 10C is a ray diagram illustrating the curvature of field resulting from a 
misalignment of a virtual projector by 15 degrees; 

Figure 11 is a plot of curvature of field against misalignment in the arrangements of 
Figures 10A to 10C; 

Figure 12 is a schematic representation of one apparatus in accordance with 
projector-camera aspects of the invention; 

Figure 13 is a sketch perspective ray diagram showing one optical arrangement of 
the apparatus of Figure 12; 

Figure 14 shows an object image and a calibration image acquired by the apparatus 
of Figures 12 and 13 and the correlation of their features; 



Figure 15 is a sketch perspective ray diagram showing a variant of Figure 13 in 
which two reference surfaces are used to locate the camera and projector of the 
apparatus of Figure 12 on the baseline connecting their respective perspective 
centres; 

Figure 16 is a flow diagram illustrating a method of operation of the apparatus of 
Figures 12 and 13 in accordance with a projector-camera aspect of the invention; 

Figure 17 is a screenshot illustrating the fitting together of two 3D surface portions 
of the object using the apparatus of Figures 1 and 2 or the apparatus of Figure 12; 

Figure 18 is a further screenshot showing the scaling of the resulting composite 3D 
surface portion along vertical and horizontal axes; 

Figure 19 is a further screenshot showing the scaling of intersecting 3D surface 
portions to fit each other, and 

Figure 20 is a screenshot showing a user interface provided by the apparatus of 
Figures 1 and 2 or the apparatus of Figure 12 for manipulating the images and 3D 
surface portions. 

Referring to Figure 1, the apparatus comprises a personal computer 4 (eg a 
Pentium® PC) having conventional CPU, ROM, RAM and a hard drive and a 
connection at an input port to a digital camera 1 as well as a video output port 
connected to a screen 5 and conventional input ports connected to a keyboard and a 
mouse 6 or other pointing device. The hard drive is loaded with a conventional 
operating system such as Windows® 95 and software: 

a) to display images acquired by the camera 1; 

b) to correlate points in overlapping regions of images input from the camera 1; 

c) to derive the line of movement of the camera from the acquired images; 

d) to project images acquired by the camera into a simulated 3D space from virtual 
projectors located on the line of movement at a separation selected (with the 
keyboard or pointing device) by the user (thereby creating a partial 3D 
reconstruction); 

e) to scale the partial 3D reconstruction along one or more axes and combine such 
partial 3D reconstructions as illustrated in Figures 12 and 13, and 

f) to apply or compensate for lateral distortion and curvature of field as illustrated in 
Figure 9 and Figures 10A to 10C. 

The software to carry out function a) can be any suitable graphics program and the 
software to carry out function b) can be based on the algorithms disclosed in Hu et al 
"Matching Point Features with Ordered Geometric, Rigidity and Disparity 
Constraints" IEEE Transactions on Pattern Analysis and Machine Intelligence Vol 
16 No 10, 1994 pp 1041-1049 (and references cited therein). One suitable algorithm 
is the Gruen algorithm. 

Referring to Figure 1, the camera CM, a Pulnix M-9701 progressive scan digital 
camera, which may be hand-held and carrying a 3-axis vibratory gyroscope G with 
associated filtering circuitry (a Kalman filter is presently preferred) and integrating 
circuitry to generate and display on screen 3-axis orientation signals or (as shown at 
V) may be mounted on a tripod T or other support, for example, defines a first set of 
axes x, y, z in its initial viewpoint (ie position and orientation) and is used to acquire 
an image of the object 3 and to display the image on screen 5. The origin of the x y z 
coordinate system is taken to be the optical centre of the camera lens. The camera is 
then moved an arbitrary distance to a new position CM' and a second image of 
object 3 is acquired with the orientation of the camera (relative to the x y z 
coordinate system) maintained unchanged. This unchanged orientation can be 
achieved with the aid of a suitable support if necessary. A method of checking for 
changes in orientation, based on the convergence of the projections from 
corresponding points of the two images in the image plane I of the camera, will 
subsequently be described with reference to Figure 4. Only a few (eg 3, 4, 5 or up to 
eg 10) pairs of correlated points need to be found for this purpose and can be derived 
visually by the user from the images displayed on screen or by the software in 
computer 4. As will become apparent from Figure 4, the camera movement (ie the 
line in the x y z coordinate system joining the optical centre of the camera lens in 
its two positions) can also be found at this stage. In some cases however in which a 
highly accurate reconstruction is not required, the direction of camera movement can 
be estimated by the user. 

The remaining corresponding points in the two images are then correlated by the 
computer 4 (preferably taking advantage of the information on camera movement 
obtained in the previous step, eg by searching along the epipolar line in one image 
corresponding to a point of interest in the other image) and a partial 3D 
reconstruction of the object 3 in simulated 3D space is generated from the correlated 
points as will be described below with reference to Figure 3. Since the distance 
between the positions 1 and 1' is not known, a parameter representing this distance 
is entered by the user, either from the keyboard or from the pointing device 6 for 
example. The computer 4 is programmed to display the resulting 3D reconstruction 
on screen and the user can vary the parameter interactively in order to arrive at a 
partial 3D reconstruction in the x y z coordinate system which bears a desired 
relationship to the actual object 3. Typically this will be stretched or compressed in 
the direction of movement between positions CM and CM', relative to the actual 
object. 

Depending on the accuracy with which the camera orientation is maintained in the 
two positions (eg with the aid of orientation signals from gyroscope G) and the 
accuracy of the estimated or optically derived camera movement there may be 
distortions in the partial 3D reconstruction which can be corrected at this stage. The 
correction of such distortion (and indeed the deliberate addition of such distortion 
when this is desired) will be described subsequently with reference to Figures 9 and 
10A to 10C. 



A further partial 3D reconstruction is then generated by moving the camera CM to a 
new viewpoint CMA such that the object lies in a region of overlap, ie such that at 
least one point P in the camera's field of view at viewpoints CM and CM' remains in 
the camera's field of view at viewpoint CMA. This movement can be represented by 
a rotation θ about the y axis (resulting in new axes x1, y1 and z1) followed by a 
rotation φ about the x1 axis (resulting in new axes x', y' and z') followed by a 
rotation κ about the z' axis, followed by translations ΔX, ΔY and ΔZ along the 
resulting axes. The rotation κ about the z' axis will in many cases be zero, as shown 
in Figure 1. 




After acquiring an image at the viewpoint CMA the camera is moved to a new 
position 1A' and a further image is acquired and displayed (the orientation of the 
camera being adjusted to remain the same as at viewpoint CMA). A further partial 
3D reconstruction of object 3 is then performed by the computer 4 in a manner 
analogous to that described above in connection with viewpoints CM and CM'. 

If it is assumed that the origin of the x', y', z' coordinate system is shifted to the 
optical centre of the camera at viewpoint CMA to give a new coordinate system X, 
Y, Z then the relationship between any point XYZ in the new XYZ coordinate 
system and the same point xyz in the xyz coordinate system is given by: 

    [ ...          ...                        ...                      ] 
    [ sinκcosθ     cosκcosφ + sinκsinθsinφ    cosκsinφ - sinκsinθcosφ  ] 
    [ sinθ         -cosθsinφ                  cosθcosφ                 ] 

The terms in the 3x3 matrix can be shown to be the cosines of the angles between the 
axes of the XYZ and xyz coordinate systems (see eg Boas, Mathematical Methods in 
the Physical Sciences, Pub John Wiley & Sons, 2nd Edn pp 437 and 438). 
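
The direction-cosine property can be checked numerically. The sketch below composes three elementary rotations through θ, φ and κ; the order of the factors and the sign conventions are assumptions chosen so that the second and third rows of the product agree with the legible terms of the matrix above (taking its first legible entry as sinκcosθ), and they are not taken from the source. Since the rows of such a product are mutually perpendicular unit vectors, each entry is the cosine of the angle between one axis of the new system and one axis of the old.

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation_matrix(theta, phi, kappa):
    """Assumed composition Rz(kappa) * Ry(theta) * Rx(phi), angles in radians."""
    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(phi), math.sin(phi)
    ck, sk = math.cos(kappa), math.sin(kappa)
    rx = [[1.0, 0.0, 0.0], [0.0, cp, sp], [0.0, -sp, cp]]
    ry = [[ct, 0.0, -st], [0.0, 1.0, 0.0], [st, 0.0, ct]]
    rz = [[ck, -sk, 0.0], [sk, ck, 0.0], [0.0, 0.0, 1.0]]
    return matmul(rz, matmul(ry, rx))


def is_orthonormal(m, tol=1e-12):
    """True if the rows of m are mutually perpendicular unit vectors."""
    for i in range(3):
        for j in range(3):
            dot = sum(m[i][k] * m[j][k] for k in range(3))
            if abs(dot - (1.0 if i == j else 0.0)) > tol:
                return False
    return True


if __name__ == "__main__":
    m = rotation_matrix(0.3, 0.5, 0.7)
    print(is_orthonormal(m))
    for row in m:
        print(row)
```

Under these assumed conventions the first row of the product is the cross product of the second and third rows.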

Accordingly the partial 3D reconstructions can be transformed to a common 
coordinate system and elongated/compressed along all three axes to minimise 
discrepancies between them in their region of overlap and to build up a more 
complete reconstruction of the object 3. 

In the present embodiment this process is carried out under the control of a user by 
displaying the partial 3D reconstructions on screen and varying their relative 
elongation/compression along three axes, as will be described in more detail with 
reference to Figure 19. 

In another embodiment the partial 3D reconstructions are combined without user 
intervention using the Iterative Closest Point algorithm. This algorithm is publicly 
available and therefore it is not necessary to describe it in detail. Briefly however, it 
registers two surface edges by searching for a group of (say) the ten closest pairs of 
points on the respective edges. The surfaces are then repositioned to minimise the 
aggregate distance between these pairs of points and then a new group of closest 
pairs of points is found. The repositioning step is then repeated. Other methods of 
correlating 3D surface regions are disclosed in our GB 2,292,605B. In accordance 
with one aspect of the present invention, a scaling factor or other variable is 
generated either by the user or iteratively under software control in order to adjust 
the relative sizes of the partial 3D reconstructions to ensure they can be fitted 
together into a self-consistent overall surface description of the object. 
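
The closest-pairs iteration described above can be sketched in a much reduced form. The version below is an assumption-laden illustration rather than the Iterative Closest Point algorithm proper: it repositions by translation only (no rotation or scaling), pairs each moving point with its nearest fixed point by brute force, and uses the mean offset of the `k` closest pairs, which minimises the sum of squared distances for those pairs. All names are hypothetical.

```python
import math


def icp_translate(moving, fixed, k=10, iterations=20, tol=1e-9):
    """Translation-only closest-point registration of two 3D point lists.

    Returns the repositioned copy of `moving` and the total shift applied.
    """
    pts = [tuple(p) for p in moving]
    total = (0.0, 0.0, 0.0)
    for _ in range(iterations):
        pairs = []
        for p in pts:
            q = min(fixed, key=lambda f: math.dist(p, f))  # nearest fixed point
            pairs.append((math.dist(p, q), p, q))
        pairs.sort(key=lambda item: item[0])
        closest = pairs[:k]  # the k closest pairs, as in the text
        shift = tuple(sum(q[i] - p[i] for _, p, q in closest) / len(closest)
                      for i in range(3))
        pts = [tuple(p[i] + shift[i] for i in range(3)) for p in pts]
        total = tuple(total[i] + shift[i] for i in range(3))
        if math.sqrt(sum(s * s for s in shift)) < tol:
            break  # repositioning no longer changes anything
    return pts, total


if __name__ == "__main__":
    fixed = [(float(i), float(j), 0.0) for i in range(4) for j in range(4)]
    moving = [(x + 0.3, y - 0.2, z + 0.1) for x, y, z in fixed]
    _, shift = icp_translate(moving, fixed)
    print(shift)
```

A scaling factor of the kind mentioned above could be estimated in the same loop, but is omitted here to keep the sketch short.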

Returning to the description of the embodiment of Figures 1 and 2, further partial 3D 
reconstructions can be derived similarly from the other sides of the object 3 and 
combined with each other and/or with the existing combination of partial 3D 
reconstructions until a complete 3D representation of the object is achieved. 



The method is summarised in Figure 2. 

Overlapping images are captured (step S10), pairs of points (or larger features eg 
lines or areas) are correlated (step S20) and at least the approximate camera 
movement between the viewpoints is determined (step S30). In the preferred 
embodiment this is determined with a high degree of accuracy by processing the two 
images. Partial 3D reconstructions of the object surface are then generated in a 
simulated 3D space using the computer 4 to process the correlated pairs of points 
and camera movement (step S40) and these are combined, preferably interactively 
on screen by the user to give a consistent but possibly distorted (eg compressed or 
elongated) 3D representation (step S50). Optionally, this is distorted or undistorted 
by applying appropriate compression or elongation (step S60). 





In an alternative embodiment, steps S30 and S40 can be combined in a matrix 
processing method, eg using the commercially available INTERSECT program 
produced by 3D Construction Inc. 

Step S40 will now be described in more detail with reference to Figure 3. 

Figure 3 shows the object 3 in the field of view of the camera CM at positions CM 
and CM'. Principal ray lines from points Qa, Qb and Qc on the surface of the object 
pass through optical centres Oc and Oc' of the camera in the respective positions 
CM and CM' and are each imaged on the image plane. The pair of points thus 
formed on the image plane from each of points Qa, Qb and Qc (and many other 
points, not shown) are corresponding points and can be correlated by known 
algorithms. 

Accordingly it will be appreciated that if virtual projectors pr1 and pr2 with the same 
orientation, focal length and other optical characteristics as the camera at its 
respective viewpoints were placed at positions CM and CM' then they would 
project ray lines into virtual 3D space from the correlated pairs of points which 
would intersect at the true locations of the corresponding points Qa, Qb, Qc ... and 
all other points on the object surface corresponding to a correlated pair of image 
points. By a virtual projector is meant any operator on the image which behaves as 
an optical projector of the image. A suitable software routine for processing the 
image in the required manner could be written without difficulty by persons skilled 
in the art. 

Although the projectors are shown in Figure 3 with the same orientation in the 
reference frame of the object (corresponding to the common camera orientation) this 
is merely a preferred feature which enables the vector V to be determined more 
easily by the image processing method disclosed in Figure 4. In principle the above 
analysis is also applicable to virtual projectors of different orientations, 
corresponding to respective, different camera orientations. As noted above, the 
camera orientations can be determined by an inertial sensor and integrating circuitry 
similar to that disclosed in our above-mentioned UK patent GB 2,292,605B. 

If the vector V joining the optical centres of the camera/projector lenses in positions 
CM and CM' is extended and projector pr2 is moved along this vector with its 
orientation unchanged (eg to position pr2' as shown) and the same ray lines are 
projected from it then these will still intersect the ray lines from projector pr1 but at 
different positions in simulated 3D space. For example the intersections at Qa, Qb 
and Qc will be replaced by intersections at Qa', Qb' and Qc' respectively. This is 
because any ray line from Op' parallel to a ray line from Oc' will lie in the same 
plane as a ray line from Oc intersecting the ray line from Oc' and therefore intersect 
that ray line from Oc. For example line Op'Qa' lies in the same plane as triangle 
OcOc'Qa and will therefore intersect line OcQa, in this case at Qa'. 
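
The scaling behaviour just described can be verified with a short numerical sketch. It assumes ideal, noise-free rays and places the first projector at the origin so that the scaling is easy to read off; the function names and the test coordinates are hypothetical.

```python
def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def intersect(o1, d1, o2, d2):
    """Point midway between the closest points of two 3D ray lines."""
    w = [p - q for p, q in zip(o1, o2)]
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)
    denom = a * c - b * b  # zero only for parallel rays
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    return tuple((o1[i] + s * d1[i] + o2[i] + t * d2[i]) / 2.0 for i in range(3))


def slide_projector(o1, o2, factor):
    """Move the second projector along the line joining o1 and o2 (vector V)."""
    return tuple(o1[i] + factor * (o2[i] - o1[i]) for i in range(3))


if __name__ == "__main__":
    o1, o2 = (0.0, 0.0, 0.0), (1.0, 0.0, 0.5)
    surface = [(0.2, 0.1, 5.0), (0.8, -0.3, 6.0), (-0.4, 0.5, 4.0)]
    o2_moved = slide_projector(o1, o2, 2.0)  # double the projector separation
    for q in surface:
        ray1 = tuple(q[i] - o1[i] for i in range(3))
        ray2 = tuple(q[i] - o2[i] for i in range(3))  # direction kept unchanged
        print(q, intersect(o1, ray1, o2_moved, ray2))
```

With the separation doubled, each reconstructed point lies at twice its original distance from the first projector along the same ray, ie the reconstruction is a scaled replica.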

Hence a scaled representation 3/3' can be generated by any pair of virtual projectors 
on vector V, if V is parallel to the line joining the optical centres of the camera lens 
at the two positions at which the projected images are acquired and the projectors 
have the same orientation(s) as the camera. This last condition can be satisfied even 
if the correct orientation is initially unknown, namely by adjusting the orientation 
about any axis perpendicular to vector V until the respective ray lines from any pair 
of correlated points intersect. This procedure can be carried out either manually by 
the user with the aid of a suitable display of the images and ray lines on screen or 
automatically in software. 

Accordingly a 3D representation of the object can be generated once the direction of 
vector V is known. Figure 3 will be referred to again in connection with a very 
similar method applicable to the virtual projectors associated with the 
projector-camera aspects of the invention. 

Figure 4 illustrates one derivation of vector V (step S30 in Figure 2). For ease of 
understanding, the camera (having a lens with optical centre O) is considered to be 
stationary and the object 3 is considered to move to position 3' along line M. Points 
P1, P2 and P3 are imaged as points p1, p2 and p3 when the object is in position 3 
and as points p1', p2' and p3' when the object is in position 3'. If and only if the 
orientation of the camera relative to the object frame is unchanged between the two 
positions, then the projections of the lines L1, L2 and L3 joining pn and pn' (n = 1 to 
3 in this illustration but in general will be much larger) will meet at a common point 
VP, which is somewhat akin to the vanishing point in perspective drawing. 

It will be seen that the line joining VP to the optical centre O is the vector V which is 
the locus of the optical centre of the camera (or desired locus of the virtual projector) 
relative to the object. 
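
One possible way of locating VP and V from a handful of correlated point pairs is sketched below. It is an illustration under stated assumptions, not the disclosed routine: image coordinates are measured from the point where the optical axis meets the image plane, in the same units as the focal length; VP is taken as the least-squares point nearest to all the lines; and the residual is offered as one measure of how well the lines converge. The function names are hypothetical.

```python
import math


def vanishing_point(pairs):
    """Least-squares meeting point of the lines joining each (p, p') pair.

    pairs: list of ((x, y), (x2, y2)) image points.
    Returns ((vx, vy), rms), where rms is the RMS distance of VP from the lines.
    """
    sxx = sxy = syy = bx = by = 0.0
    lines = []
    for (x0, y0), (x1, y1) in pairs:
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy)
        if length == 0.0:
            continue  # the point did not move, so it defines no line
        nx, ny = -dy / length, dx / length  # unit normal of the line
        c = nx * x0 + ny * y0               # line equation: nx*x + ny*y = c
        lines.append((nx, ny, c))
        sxx += nx * nx
        sxy += nx * ny
        syy += ny * ny
        bx += nx * c
        by += ny * c
    det = sxx * syy - sxy * sxy
    if len(lines) < 2 or abs(det) < 1e-12:
        raise ValueError("lines are parallel or too few: VP is at infinity")
    vx = (syy * bx - sxy * by) / det
    vy = (sxx * by - sxy * bx) / det
    rms = math.sqrt(sum((nx * vx + ny * vy - c) ** 2 for nx, ny, c in lines)
                    / len(lines))
    return (vx, vy), rms


def movement_direction(vp, focal_length):
    """Unit vector from the optical centre O through VP on the image plane."""
    v = (vp[0], vp[1], focal_length)
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)
```

A large residual would indicate that the camera orientation changed between the two positions, since the lines then fail to meet at a common point.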

In general, if the camera is hand-held, the lines L1, L2, L3 ... Ln connecting the 
correlated points will not meet at a common point, owing to a change in orientation 
of the camera between the two positions. However the orientation of the camera can 
be varied about the X, Y and Z axes by the user at the second position whilst 
displaying the lines L1, L2, L3 ... Ln on screen and the second image captured only 
when these lines converge on or near a common point, as determined either visually 
or by a software routine. Indeed the image can be captured, under control of the 
software routine, only when the necessary convergence is achieved. 

If gyroscope G is used then its 3-axis orientation signals can be used to maintain the 
orientation of the camera between the two viewpoints. 

It should be noted that if the camera is considered to be moving and the object to be 
fixed, then the movement of the camera from its original position is derived by 
superimposing the image plane and associated image of the camera in its new 
position on the image plane (and image) of the camera in its first position, projecting 
the lines L1, L2, L3 ... Ln in the image plane of the camera in its first position and 
connecting the resulting point of intersection VP to the optical centre of the camera 
in its first position. The resulting vector V is the movement of the object in the 
coordinate frame of the camera at its first position. 

In a variant of the above method, a moving image or a rapid succession of still 
images is acquired by the camera as it moves and the (assumed) rectilinear 
movement of the camera between each still image or between sequential frames of 
the moving image is derived by projecting the lines L1, L2, L3 ... Ln in the image 
plane of the camera to derive the point VP for each incremental movement of the 
camera. The resulting vector V for each incremental movement of the camera will 
change direction as the direction of movement of the camera changes and the 
segments can be integrated to determine the overall movement (including any 
change in orientation) of the camera. 

It should be noted that when there is little or no movement of the camera relative to 
the object in the depth (Z) direction then lines L1, L2, L3 ... Ln will be parallel or 
nearly so and VP will be at infinity or too far away to be determined with accuracy. 

This special case is illustrated in Figures 5 and 6. Points A, B and C at corners of 
object 3 in its initial position are imaged by lens L as triangle abc in the image plane 
I of the camera. When the object moves to a new position 3' these corners A', B' 
and C' are imaged as triangle a'b'c' which will be similar (in the narrow geometrical 
sense) to triangle abc but may for example be rotated. For reasons which are 
explained below, it will be assumed initially that points A, B and C lie in a plane 
which is substantially parallel to the image plane I. 

It should be noted that the images abc and a'b'c' would also be obtained from a 
smaller object of the same shape correspondingly nearer the camera, as illustrated by 
face A1B1C1. 

Referring now to Figure 6 which shows the object 3/3' (not its image) as seen by the 
camera, various possible faces ABC, A1B1C1 and A2B2C2 are shown. There will 
be a continuous range of possible sizes for face ABC; for the sake of clarity only the 
above three are shown. However the possible faces all have a common centroid P. 

When the object moves to a new position 3' illustrated by face A'B'C' the centroid 
will move to a new position Q and the line PQ, which will be parallel to the image 
plane I (Figure 5) will represent the direction of movement irrespective of the true 
size and distance of the object. In fact this will remain true even if there is some 
rotation about the Z axis (such that the lines AA', BB' and CC' are not parallel). 

However if the camera is rotated at the second position about the Z axis to ensure 
that the above lines AA', BB' and CC' are in fact parallel, then the above analysis 
holds true even if the points A, B and C do not lie in a plane parallel to the image 
plane I; for example if the image abc is derived from points A1BC in Figure 5. The 
corresponding centroid (not shown) of A1BC in Figure 6 is displaced from centroid 
P and the corresponding centroid of A1'B'C' after movement of the object to 
position 3' is similarly displaced from centroid Q. However the line joining these 
new centroids will be parallel to line PQ and will therefore correctly indicate the 
direction of movement of the object relative to the camera. 

Even if there is some movement of the camera in the Z direction such that lines AA', 
BB' and CC' are not parallel but converge to a distant point, line PQ will still 
correctly represent the direction of movement of the object 3. 
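
The centroid construction can be sketched as follows for the purely lateral case. The sketch assumes the correlated image points are supplied in corresponding order before and after the movement; the function names are hypothetical.

```python
import math


def centroid(points):
    """Mean position of a list of (x, y) image points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def lateral_direction(points_before, points_after):
    """Unit direction of the line PQ joining the two centroids."""
    px, py = centroid(points_before)
    qx, qy = centroid(points_after)
    dx, dy = qx - px, qy - py
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        raise ValueError("centroid did not move")
    return (dx / norm, dy / norm)


if __name__ == "__main__":
    abc = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
    abc_moved = [(x + 3.0, y + 4.0) for x, y in abc]
    print(lateral_direction(abc, abc_moved))
```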

Thus the above method overlaps to some extent with the method of Figure 4. 

The overall method of determining the direction of movement of the camera is 
illustrated in Figure 7. A first image is captured (step S11) and the camera is moved 
to a new position with the object still in its field of view and the first image is 
displayed on screen 5 (Figure 1), superimposed on the instantaneous image seen by 
the camera at the new position (step S12). Corresponding points are correlated, 
either by eye (ie by the user) or by a suitable software routine (step S13). Only a 
small number of points need to be correlated at this stage, eg 100 or fewer, eg 10, 
depending on the processing speed and the accuracy required. 

If it appears to the user that there has been movement in the Z direction the method 
branches to step S14 at which the orientation of the camera is adjusted about the X, 
Y and Z axes until the lines joining the correlated points converge to a common 
"vanishing point" VP (Figure 4). At this stage the second image is captured and 
stored in memory. 

In step S15 a line is projected from the point VP through the optical centre of the 
camera lens to find the locus V (Figure 4) and with the aid of this information, 
further correlation of the images is performed and a partial 3D reconstruction is 
generated by the method illustrated in Figure 3 (step S17). 

Preferably, step S15 also involves or is preceded by the selection of a camera model 
(eg from a list displayed on screen) by the user. The computer 4 is programmed to 
store a list of camera models M1, M2 ... Mn in association with the parameters of 
each model needed for the calculation of the required projection and other 
processing. The following parameters may be stored: 

Camera model Mn: 

pixel size in x direction 
pixel size in y direction 
focal length 
x dimension of film plane (in pixels) 
y dimension of film plane (in pixels). 
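
A stored camera model of this kind might be represented as sketched below. The field names, the example values and the `pixel_to_ray` helper are illustrative assumptions; in particular the helper assumes that the optical axis meets the film plane at its centre and that there is no lens distortion, neither of which is stated in the parameter list above.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class CameraModel:
    name: str
    pixel_size_x: float  # same length unit as focal_length (eg mm)
    pixel_size_y: float
    focal_length: float
    width_px: int        # x dimension of film plane in pixels
    height_px: int       # y dimension of film plane in pixels

    def pixel_to_ray(self, col, row):
        """Unit direction of the ray projected through pixel (col, row)."""
        x = (col - self.width_px / 2.0) * self.pixel_size_x
        y = (row - self.height_px / 2.0) * self.pixel_size_y
        z = self.focal_length
        n = math.sqrt(x * x + y * y + z * z)
        return (x / n, y / n, z / n)


# Hypothetical list entry; the numbers are made up for illustration.
CAMERA_MODELS = {
    "M1": CameraModel("M1", 0.01, 0.01, 8.0, 640, 480),
}

if __name__ == "__main__":
    print(CAMERA_MODELS["M1"].pixel_to_ray(320, 240))
```

Selecting a model from the list then supplies everything a virtual projector needs to turn pixel positions into ray lines.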

If it appears to the user (eg as a result of a failure to find a reasonably close 
"vanishing point" VP) that there has not been significant movement in the Z 
direction then the movement of the object is assumed to be parallel to the image 
plane and the line PQ is found from the images of a group of eg three or more points 
(not necessarily corners) preferably lying in or close to a plane parallel to the image 
plane (step S16) before proceeding to step S17. 

If only because most digital cameras currently on the market have a limited 
resolution (eg 1024 x 768 pixels, 640 x 480 pixels or even less) there will almost 
inevitably be some error in the determination of the camera movement and hence in 
the relative placement and orientation of the virtual projectors pr1 and pr2 in 
Figure 3. The effect of this error is illustrated in Figure 8. 

Figure 8 is a ray diagram orthogonal to the image plane showing the imaging of a 
line of points P1, P2 and P3 on the surface of the object 3 by a camera at positions 
CM and CM'. For the sake of clarity the lens L is shown only at the first position 
CM. It will be noted that the orientation of the camera is the same at the two 
positions. 

Virtual projectors pr1 and pr2 would correctly reconstruct the object as 
reconstruction 30 if located at the above positions with the above orientation. If 
projector pr2 were moved to position pr2' along the direction of movement V 
(Figure 4) of the cameras then the resulting reconstruction 30' would still show the 
points (P1', P2' and P3') on a straight line with no curvature of field, with correct 
scaling in the lateral direction (ie the length ratio P1P2/P2P3 = P1'P2'/P2'P3'). 

If the projector is successively rotated to new positions pr2'' and pr2''' then
successively greater curvature is introduced into the reconstruction 30'' or 30''', as
shown. The relationship between the above curvature of field of the reconstruction 
30 and the angular error in the orientation of the second virtual projector relative to 
the first is considered below in relation to Figures 10A to 10C. 

However, referring to Figure 9, in which the respective ray lines from the correlated 
points of the images projected by projectors pr1 and pr2 are substantially
orthogonal, it can be seen that an error in the relative orientation of the projectors,
resulting in the projection of the second image from projector pr2', results in a
negligible curvature of field: line 30' defined by the intersection of the ray lines
from pr2' with the corresponding ray lines from projector pr1 is substantially
straight, like line 30 which is defined by the corresponding intersections of the ray
lines from correctly co-oriented projectors pr1 and pr2. Accordingly it is a preferred
feature of the invention that the camera positions are so chosen that the angle 
subtended by the ray lines from a pair of correlated points at their intersection at 
their corresponding point at the object surface is substantially 90 degrees, eg 90 
degrees ± 30 degrees, preferably 90 degrees ± 20 degrees, most preferably 90 
degrees ± 10 degrees. This minimises the correction required due to curvature of 
field.

There remains the lateral distortion of the reconstruction, ie the discrepancy between
the ratios P1P2/P2P3 and P1'P2'/P2'P3' (Figure 8). If however the partial 3D
reconstruction 30' is rotated to position 30C so as to lie parallel to the image plane
of projector pr1, as shown in Figure 9, and all its points are shifted so as to be
intercepted by the ray lines from projector pr1, then the triangles defined by these
shifted points and the ray lines from pr1 will be geometrically similar to the triangles
defined by the corresponding points on line 30 and these ray lines, and hence the
distribution of points along line 30C will be a scaled replica of the corresponding
distribution along line 30. It will be understood that this is so irrespective of the
angle subtended by the ray lines at their intersection at their corresponding point at
the object surface.

Accordingly, distortion parallel to the image plane can be corrected for (or 
deliberately applied) by rotating the partial 3D reconstruction about an axis 
perpendicular to the image plane of a projector used to generate that partial 
reconstruction whilst constraining the points in the partial 3D reconstruction to lie on 
their ray lines from that projector. 
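The similar-triangles argument can be checked numerically with a minimal sketch. It assumes that the perspective centre of the projector is at the origin, that its image plane is perpendicular to the z axis, and that the correct reconstruction is planar and parallel to that image plane; the function name and the sample coordinates are hypothetical.

```python
import numpy as np


def slide_to_depth(points, depth):
    """Slide each point along its ray line from the origin (the assumed
    perspective centre) until its z coordinate equals `depth`."""
    points = np.asarray(points, dtype=float)
    return points * (depth / points[:, 2])[:, None]


def length_ratio(p):
    """Ratio P1P2 / P2P3 for three points."""
    return np.linalg.norm(p[1] - p[0]) / np.linalg.norm(p[2] - p[1])


# Correct reconstruction: three points in a plane parallel to the image plane.
line_30 = np.array([[-1.0, 0.0, 5.0], [0.5, 0.0, 5.0], [3.0, 0.0, 5.0]])

# Distorted reconstruction: each point displaced along its own ray line.
line_30_dash = line_30 * np.array([0.8, 1.0, 1.3])[:, None]

# Corrected reconstruction: points constrained to their ray lines and
# brought back into a plane parallel to the image plane.
line_30c = slide_to_depth(line_30_dash, depth=4.0)

print(length_ratio(line_30), length_ratio(line_30_dash), length_ratio(line_30c))
```

In this sketch the corrected points are the original points multiplied by 4.0/5.0, so the length ratio of the correct reconstruction is recovered although the overall scale is not.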


Figures 10A to 10C illustrate the curvature of field resulting from angular 
misalignment of one projector pr2' from its correct orientation pr2. In each case the 
correct reconstruction 30 defined by pr1 and pr2 is planar and the centre of curvature
CN of the actual reconstruction 30' is shown (and was derived geometrically). In
Figures 10A to 10C, the misalignment is generated by rotation of the projector pr2
about its perspective centre by 5 degrees, 10 degrees and 15 degrees respectively. 

The radius of curvature R of reconstruction 30' is inversely proportional to the
misalignment as shown in Figure 11. It should be noted that in Figures 10A to 10C,
the angle subtended by the ray lines at their intersection at their corresponding point
at (the reconstruction 30 of) the object surface is very much less than the optimum
angle of 90 degrees and hence the degree of curvature is very much greater than
would normally be obtained in practice.


Hence the above method of the invention allows a considerable latitude in the 
orientation of the projectors which implies that a considerable uncertainty in the 
relative orientation of the camera at the two viewpoints is permissible.

The corrections noted above can be applied at any stage during the generation or
fitting together of the partial 3D reconstructions.

It should be noted however that only a limited misalignment of the virtual projectors
pr1 and pr2 (Figure 3) is permissible. Referring to Figure 3, all the principal ray
lines projected from the correlated points will intersect only if the projectors have
the correct orientation. The latitude in orientation arises only from the finite
resolution of the camera(s) which implies that the ray lines from the projectors have
a finite thickness or do not intersect exactly. Thus a point on the reconstruction 30
can be found by determining the midpoint of the shortest line joining the ray lines
from the respective projectors, provided that the length of this shortest line does not
exceed a predetermined limit corresponding to the resolution of the camera(s).
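The midpoint construction can be sketched as follows, using the standard closed-form solution for the closest points of two straight lines. The function name and the parameter `max_gap` (standing for the predetermined limit) are hypothetical, and how the limit is derived from the camera resolution is left open here.

```python
import numpy as np


def ray_midpoint(o1, d1, o2, d2, max_gap):
    """Midpoint of the shortest line joining two ray lines.

    o1, o2: origins (perspective centres); d1, d2: ray directions.
    Returns None if the ray lines are parallel or if the shortest line
    joining them is longer than max_gap.
    """
    o1, d1, o2, d2 = (np.asarray(v, dtype=float) for v in (o1, d1, o2, d2))
    w = o1 - o2
    a, b, c = d1 @ d1, d1 @ d2, d2 @ d2
    d, e = d1 @ w, d2 @ w
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        return None                      # parallel ray lines
    s = (b * e - c * d) / denom          # parameter along the first ray line
    t = (a * e - b * d) / denom          # parameter along the second ray line
    p1 = o1 + s * d1
    p2 = o2 + t * d2
    if np.linalg.norm(p1 - p2) > max_gap:
        return None                      # correspondence rejected
    return 0.5 * (p1 + p2)


# Two ray lines that miss each other by 0.2 units.
point = ray_midpoint([0, 0, 0], [1, 0, 0], [2, -1, 0.2], [0, 1, 0], max_gap=0.5)
rejected = ray_midpoint([0, 0, 0], [1, 0, 0], [2, -1, 0.2], [0, 1, 0], max_gap=0.1)
```

Returning `None` for over-limit pairs is one possible design choice; it lets the caller discard correspondences that imply an excessive projector misalignment.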

A projector-camera embodiment closely analogous to the camera-camera
embodiment of Figure 1 will now be described with reference to Figure 12. 



Referring to Figure 12, the apparatus comprises a personal computer 4 (eg a
Pentium® PC) having conventional CPU, ROM, RAM and a hard drive and a frame
grabber connection at an input port to a digital camera CM as well as a video output
port connected to a screen 5 and conventional input ports connected to a keyboard
and a mouse 6 or other pointing device. The hard drive is loaded with a conventional
operating system such as Windows® 95 and software:

a) to display images acquired by the camera CM;

b) to generate correspondences between overlapping regions of images input from
the camera CM;

c) to derive from the acquired images the baseline joining the perspective centres of
the camera and projector;

d) to project images acquired by the camera into a simulated 3D space from virtual
projectors located on the baseline at a separation selected (with the keyboard or
pointing device) by the user or determined (thereby creating a partial 3D
reconstruction);

e) to scale the partial 3D reconstruction along one or more axes and combine such
partial 3D reconstructions as illustrated in Figures 17, 18 and 19; and

f) to determine the separation of the perspective centres of the camera and projector
along the baseline from further correlations of object images and calibration images,
and thence derive an accurate partial 3D reconstruction of the object surface.

Additionally the software is preferably arranged to correct the images for distortion
due eg to curvature of field of the camera and projector optics before they are
processed as described above, either during an initial calibration procedure or as
part of a ray bundle adjustment process during the processing of the object and
calibration image(s). Suitable correction and calibration procedures are described by
Tsai in "An Efficient and Accurate Camera Calibration Technique for 3D Machine
Vision" Proc IEEE CVPR (1986) pp 364-374 (supra) and will not be described
further.

The camera, a Pulnix M-9701 progressive scan digital camera, is shown mounted at 
one end of a support frame F on eg a ball and socket mounting and a slide projector
PR is shown securely mounted on the other end of the frame. Slide projector PR is 
provided with a speckle pattern slide S and is arranged to project the resulting 
speckle pattern onto the surface of region R of an object 3 which is in the field of 
view of camera CM. 


The intrinsic camera parameters are initially determined by acquiring images of a 
reference plate (not shown) in known positions. The reference plate is planar and has 
an array of printed blobs of uniform and known spacing. The following parameters 
are determined and are therefore assumed to be known in the subsequent description: 

i) focal length of the camera

ii) distortion parameters of the lenses of the camera and projector

iii) scale factor

iv) image coordinates of principal point.

Additionally the pixel size (determined by the camera manufacturer) is assumed to be
known.







Optionally, the following extrinsic camera parameters are determined: 



a) camera location

b) camera orientation.



Alternatively the camera location and orientation can be taken to define the 
coordinate system relative to which the object surface coordinates are determined. 

Referring now to Figure 2, the camera CM is shown with its perspective centre Oc
located on a baseline vector V and viewing (initially) a target surface T and 
(subsequently) object 3. The (virtual) origin or perspective centre Op of projector PR 
also lies on baseline vector V and is defined by the optical system of the projector 
comprising field lenses OL and condenser lenses CL. A point light source LS such 
as a filament bulb illuminates slide S and directs a speckle pattern onto (initially) 
target surface T and (subsequently) the surface of object 3. 

The baseline vector V is found by the following procedure: 


Firstly an image I1 (Figure 14) of the region of the surface of object 3 illuminated
by the projected speckle pattern is acquired and stored in the memory of computer 4
and an arbitrary group of at least two spaced apart points Q1 and Q2 of this region
are selected as points q1 and q2 in the image formed on the photodetector plane PD
of the camera. The group of points q1 and q2 is stored.

Secondly the object 3 is substituted by target surface T and an image I2 (Figure 3) of
the illuminated region of the target surface is acquired by camera CM. The position
and orientation of the target T relative to the camera are found by acquiring an image
of the target in the absence of any illumination from the projector, utilising a known
pattern of blobs BL formed on the periphery of the target. The image I2 is stored and
a patch defined by its central point (eg Qn, Figure 3) of the first image I1 is
correlated with the corresponding point Pn of the second image I2 by selecting a
surrounding region R of initially 3x3 pixels and, by comparing local radiometric
intensity distributions by means of the above-described modified Gruen's algorithm,
searching for the corresponding region R' in image I2 which is allowed to be
distorted with an affine geometric distortion (eg in the simple case illustrated in
Figure 14, horizontally elongated). The correlated patch is expanded (up to a
maximum of 19 x 19 pixels) and the process is repeated. In this manner the
corresponding point Pn is found.

This process is repeated to find a large number of pairs of correspondences PQ
(Figure 14) and in particular to correlate the patches centered on P1, P2 (Figure 13)
with the points in the group Q1, Q2 (Figure 13). Since the algorithm has a sub-pixel
resolution, the latter are not necessarily centred on particular pixels. 

In the following geometric discussion the correspondences are treated for the sake of
simplicity as correlated pairs of points but it should be noted that this does not imply 
anything about their topography - in particular it does not imply that they lie at 
corners or edges of the object, for example. 


Referring to Figure 13, the origin Op (perspective centre) of the projector PR will lie
at the intersection of P1Q1 and P2Q2. However the 3D locations of these four points
are not known, only the ray lines from the camera on which they lie, namely p1P1,
q1Q1, p2P2 and q2Q2. But the line P1Q1 will lie in the plane OcP1Q1 ie plane
Ocp1q1 which is available from the calibration process and the two images I1 and
I2 and the line P2Q2 will lie in the plane OcP2Q2 ie plane Ocp2q2 which is
similarly available from the calibration process and the two images I1 and I2. These
planes define a baseline vector V by their intersection, which passes through Oc and
the perspective centre Op of the projector.


A particularly simple way of finding the baseline vector V is to project p1q1 and
p2q2 which will meet at a point X in the plane of photodetector PD. The projection
from point X through the perspective centre Oc is the baseline vector V as shown. 
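The plane-intersection construction can be sketched with vector cross products, assuming that the camera perspective centre Oc is the origin and that each image point is expressed as the direction of its ray line in camera coordinates. The function name and the synthetic coordinates are hypothetical. The normal of plane Ocp1q1 is p1 x q1, the normal of plane Ocp2q2 is p2 x q2, and the baseline direction is the cross product of the two normals.

```python
import numpy as np


def baseline_direction(p1, q1, p2, q2):
    """Unit direction of baseline vector V from two pairs of correspondences.

    Each argument is the direction of a ray line from Oc (taken as origin).
    The sign of the returned vector is arbitrary.
    """
    n1 = np.cross(p1, q1)        # normal of plane Oc p1 q1
    n2 = np.cross(p2, q2)        # normal of plane Oc p2 q2
    v = np.cross(n1, n2)         # line common to both planes
    norm = np.linalg.norm(v)
    if norm < 1e-12:
        raise ValueError("planes coincide; use different correspondences")
    return v / norm


# Synthetic check: projector centre Op, target points P, object points Q
# lying on the projector rays Op-P (all coordinates hypothetical).
Op = np.array([1.0, 0.2, 0.1])
P1 = np.array([0.3, 0.5, 4.0])
P2 = np.array([-0.4, -0.6, 5.0])
Q1 = Op + 0.7 * (P1 - Op)
Q2 = Op + 0.6 * (P2 - Op)

# Any point on a camera ray line serves as that ray line's direction.
V = baseline_direction(P1, Q1, P2, Q2)
```

As stated in the description, this yields only the direction of V and not the position of Op along it.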


In this manner the baseline vector V can be determined, though not the position of 
the projector origin Op along this baseline. In practice the groups of points P and Q 
will each comprise more than two pairs and hence overdetermine the baseline vector 
V. Accordingly the computer 4 is preferably arranged to derive a bundle of such 
vectors as determined by the sets of points PQ, to eliminate "outliers" ie those
vectors which deviate by more than a given threshold from the mean and to perform 
a least squares estimate of vector V on the remainder of the vectors, in accordance 
with known statistical methods. 
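One possible form of the outlier elimination and least squares estimate is sketched below. The angular threshold, the function name and the use of a singular value decomposition to obtain the least squares direction are assumptions; the description does not specify which statistical method is used.

```python
import numpy as np


def robust_baseline(vectors, max_angle_deg):
    """Estimate V from a bundle of candidate baseline vectors.

    Vectors deviating from the mean direction by more than max_angle_deg
    are discarded; the remainder are fitted in the least squares sense.
    Returns (unit vector, number of vectors retained).
    """
    v = np.asarray(vectors, dtype=float)
    v = v / np.linalg.norm(v, axis=1, keepdims=True)
    v[v @ v[0] < 0] *= -1.0                      # remove sign ambiguity
    mean = v.mean(axis=0)
    mean /= np.linalg.norm(mean)
    angles = np.degrees(np.arccos(np.clip(v @ mean, -1.0, 1.0)))
    inliers = v[angles <= max_angle_deg]
    if len(inliers) == 0:
        raise ValueError("no vectors within the threshold")
    # The first right singular vector minimises the summed squared
    # perpendicular distances of the inliers from the fitted direction.
    _, _, vt = np.linalg.svd(inliers)
    direction = vt[0]
    if direction @ mean < 0:
        direction = -direction
    return direction, len(inliers)


bundle = [(1, 0.01, 0), (1, -0.01, 0), (1, 0, 0.01), (1, 0, -0.01), (0, 1, 0)]
V, kept = robust_baseline(bundle, max_angle_deg=20.0)
```

Because the mean is itself pulled towards the outliers, a fairly generous threshold (or a repeated pass) may be needed in practice.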
The derivation of a three-dimensional representation of the object 3 is shown in the
ray diagram of Figure 3.

The camera CM and projector PR are shown located on baseline vector V. A first 
virtual projector pr1 is implemented by the image processing software in computer 4
and has the same optical characteristics as the camera (as determined in the initial
calibration procedure). Image I1 (Figure 14) is projected from this virtual projector
in a 3D space simulated by the image processing software. 

A second virtual projector pr2 is similarly implemented by the image processing
software and preferably has the same optical characteristics as the projector PR
(which is also represented in Figure 3). This virtual projector projects a set of ray
lines in the simulated 3D space corresponding to the respective physical projector
rays PQ and the ray lines are each labelled with the respective correlated pixels of
the image I1 as found in the image correlation process described with reference to
Figure 14. It will be appreciated that the image I2 and target T define, and can be
equated with, a set of rays originating from the perspective centre Op of the
projector. Since it is known which ray line from the projector PR/pr2 intersects each
ray line from its corresponding pixel in image I1, the point in 3D space
corresponding to each intersection can be found, and hence the set of points Qa, Qb,
Qc... defining the surface.

In practice many ray lines will not intersect and the best estimate of the 
corresponding 3D surface point will be the mid-point of the perpendicular line
joining them at their closest approach. Algorithms for this purpose are known per se. 

In the above discussion of Figure 3 it has been assumed that the relative positions of 
the camera CM and projector PR on baseline vector V (and hence the positions
of the virtual projectors pr1 and pr2) are known. In fact these are assumed or entered
as a scaling variable by the user of the computer 4. 

That the ray lines from the respective virtual projectors will intersect irrespective of
the spacing between the virtual projectors, assuming that their orientations are
unchanged, is illustrated in Figure 3 by the alternative virtual projector position pr2'
corresponding to an assumed real projector position PR'. The resulting 3D
reconstruction of the object 3' will be different in size but of the same shape: a
single scaling factor will be required to interconvert objects 3 and 3'. However it
may be convenient in practice to provide for different horizontal and vertical scaling
factors (eg because of different horizontal and vertical magnifications of the camera)
and in general only one set of scaling factors will be consistent with fitting together
a set of partial 3D surfaces of the same object acquired from different directions.
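The dependence of the reconstruction on the assumed separation can be illustrated with a minimal sketch. It assumes the first virtual projector is at the origin, the second lies at a chosen separation along a unit baseline vector, and the ray directions are unchanged; the function names and numerical values are hypothetical.

```python
import numpy as np


def intersect(o1, d1, o2, d2):
    """Least squares meeting point of ray lines o1 + s*d1 and o2 + t*d2."""
    a = np.stack([d1, -d2], axis=1)
    s, t = np.linalg.lstsq(a, o2 - o1, rcond=None)[0]
    return 0.5 * ((o1 + s * d1) + (o2 + t * d2))


def reconstruct(separation):
    """3D point obtained for a given assumed separation of the projectors."""
    baseline = np.array([1.0, 0.0, 0.0])       # unit baseline vector V
    d1 = np.array([0.2, 0.1, 1.0])             # ray line from first projector
    d2 = np.array([-0.3, 0.1, 1.0])            # ray line from second projector
    return intersect(np.zeros(3), d1, separation * baseline, d2)


point_a = reconstruct(1.0)    # expected (0.4, 0.2, 2.0)
point_b = reconstruct(2.5)    # same shape, 2.5 times larger
```

Changing the separation multiplies every reconstructed coordinate by the same factor, which is why the separation can be treated as a scaling variable.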

Accordingly the software in computer 4 is arranged to scale such acquired 3D 
representations to enable them to be fitted together to form a self-consistent overall 
3D representation of the object. This aspect is described below with reference to 
Figures 17 to 20. 

Before doing so however, an alternative calibration procedure will be described with 
reference to Figure 15, which shows two planar calibration targets T1 and T2
(having peripheral blobs or discs BL similar to target T of Figure 13) whose 
orientations (and preferably positions) relative to the camera axis system are known, 
eg as a result of a photogrammetric determination involving separately acquiring 
images of them in the absence of any illumination from the projector, and processing 
the images in a procedure similar to that described above in connection with Figure 
2. The perspective centres Oc and Op of the camera and projector are also shown. 

In a first stage of the calibration procedure, target T1 is illuminated by the structured
light from the projector and an image is acquired by the camera CM. Figure 15
illustrates three points p1, p2 and p3 at which the structured light impinges on target
T1. These (and many other points, not shown) will be imaged by the camera CM.

In a second stage of the calibration procedure, target T1 is removed and target T2 is
illuminated by the structured light from the projector. An image is acquired by the
camera CM. The three points P1, P2 and P3 corresponding to points p1, p2 and p3
are found by correlating the newly acquired image of the projection of the structured
radiation on target T2 with the previously acquired image of the corresponding
projection on T1 by the procedure described above with reference to Figure 13.

Figure 15 illustrates further the relationship between the positions of two calibration
targets T1 and T2 and the perspective centre Op of the projector PR (the camera CM
being assumed fixed on the baseline vector V). A pair of points P1 and P2 on target
T1 form image points p1 and p2 respectively on the photodetector array PD of
camera CM and (in a subsequent step following the removal of target T1) a pair of
points P3 and P4 on target T2 which are correlated with P1 and P2 respectively
form image points p3 and p4 respectively on photodetector array PD.

Accordingly in a third stage the pencil of rays formed by corresponding points on
targets T1 and T2 (eg P1 and P3; P2 and P4) is constructed to find the position of the
perspective centre Op of the projector. In practice the rays will not intersect at a
point but a best estimate can be found from a least squares algorithm.
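A least squares estimate of this kind can be sketched as the point minimising the summed squared perpendicular distances to all the rays. The sketch assumes that the 3D coordinates of the corresponding points on the two targets have already been obtained (eg by intersecting the camera ray lines with the known target planes); the function name and test coordinates are hypothetical.

```python
import numpy as np


def nearest_point_to_rays(points_t1, points_t2):
    """Point closest, in the least squares sense, to the rays joining
    corresponding 3D points on target T1 and target T2."""
    a = np.zeros((3, 3))
    b = np.zeros(3)
    for p_near, p_far in zip(points_t1, points_t2):
        p_near = np.asarray(p_near, dtype=float)
        p_far = np.asarray(p_far, dtype=float)
        d = (p_far - p_near) / np.linalg.norm(p_far - p_near)
        m = np.eye(3) - np.outer(d, d)     # projects onto the plane normal to d
        a += m
        b += m @ p_near
    return np.linalg.solve(a, b)


# Synthetic pencil of rays through a known projector centre.
true_op = np.array([1.0, 0.2, -0.5])
on_t2 = np.array([[0.0, 0.0, 5.0], [2.0, 1.0, 5.0], [-1.0, 2.0, 5.0]])
on_t1 = true_op + 0.5 * (on_t2 - true_op)

estimated_op = nearest_point_to_rays(on_t1, on_t2)
```

At least two non-parallel rays are needed for the 3 x 3 system to be solvable; with noisy data more rays improve the estimate.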


It will be appreciated that other calibration procedures are possible. For example the 
camera could be calibrated by the Tsai method (Roger Y Tsai, IEEE Journal of 
Robotics and Automation RA-3, No 4, August 1987 p 323 - see also references cited 
therein). 

Claims



1. A method of deriving a 3D representation (30) of at least part of an object (3) from 
correlated overlapping 2D images of the object acquired from different spaced apart 
viewpoints (CM, CM') relative to the object, the separation between the viewpoints 
not being precisely known, the method comprising the step of digitally processing 
the 2D images to form a 3D representation which extends in a simulated 3D space in 
dependence upon both the mutual offset between correspondences of the respective 
2D images and a scaling variable, the scaling variable being representative of the 
separation between the viewpoints at which the 2D images were acquired. 

2. A method of deriving a 3D representation (30) of at least part of an object (3) 
from a 2D image thereof, comprising the steps of illuminating the object with 
structured projected optical radiation, acquiring a 2D image (I1) of the illuminated
object, correlating the 2D image with rays of the structured optical radiation, and 
digitally processing the 2D image to form a 3D representation which extends in a 
simulated 3D space in dependence upon both the correlation and a scaling variable, 
the scaling variable being representative of the separation between a location from 
which the structured optical radiation is projected and the viewpoint at which the 2D 
image is acquired. 

3. A method as claimed in claim 1 or claim 2 wherein a view of the representation 
(30) in the simulated 3D space is displayed and the scaling variable is entered by a 
user. 

4. A method as claimed in any preceding claim, comprising the step of acquiring 
overlapping 2D images from a camera (CM) which is moved relative to the object, 
the net movement of the camera not being fully constrained. 

5. A method as claimed in claim 4 wherein the respective orientations of the camera 
(CM) at the different viewpoints relative to a reference frame which is fixed with 
respect to the object (3) differ by less than 45 degrees. 

6. A method as claimed in claim 5 wherein the difference between said respective 
orientations of the camera (CM) is less than 30 degrees.

7. A method as claimed in claim 6 wherein the difference between said respective
orientations of the camera (CM) is less than 10 degrees.



8. A method as claimed in any of claims 4 to 7 wherein the 2D images are acquired
by a hand-held camera (CM).

9. A method as claimed in any preceding claim wherein at least one 2D image is 
acquired by a camera (CM) whose orientation is determined from an output signal of 
an inertial sensor (G). 


10. A method as claimed in any preceding claim wherein the 3D representation (30)
is generated by projections from positions on a straight line (V) in simulated 3D
space which corresponds to the straight line joining respective perspective centres of
said 2D images or joining respective perspective centres of said structured optical
radiation and 2D image of the illuminated object (3).

11. A method as claimed in claim 10 wherein said straight line (V) in simulated 3D
space is determined from a pencil of projections from correspondences of aligned
2D images (I1, I2).

12. A method as claimed in claim 10 wherein said straight line in simulated 3D space
(V) is determined from the intersection of planes (OcP1Op, OcP2Op) defined by at
least one perspective centre (Oc, Op) and at least two pairs of correspondences (PQ)
between acquired 2D images (I1, I2).

13. A method as claimed in any preceding claim wherein said scaling variable is
varied by the user to enable the 3D representation (R1) to be fitted to another,
similarly derived 3D representation (R2).

14. A method as claimed in any preceding claim wherein the 3D representation (30)
is generated from the intersection of respective projections from spaced apart
perspective centres (Oc, Op), the perspective centres being derived from the mutual
offset between a first pair of correspondences (PQ) between respective 2D images
(I1, I2) and from a further mutual offset between a second pair of correspondences
between respective 2D images, further pairs of correspondences are derived from a
search constrained by the above perspective centre determination and the 3D
representation of the object is derived from the further pairs of correspondences and
the projections.

15. A method as claimed in claim 14 wherein said calculation is performed on fewer 
than one thousand pairs of image correspondences (PQ). 

16. A method as claimed in claim 15 wherein said calculation is performed on fewer 
than one hundred pairs of image correspondences (PQ). 

17. A method as claimed in claim 16 wherein said calculation is performed on fewer 
than fifty pairs of image correspondences (PQ). 

18. A method as claimed in claim 17 wherein said calculation is performed on eight 
or fewer pairs of image correspondences (PQ). 

19. A method as claimed in claim 18 wherein said calculation is performed on two or 
three or four pairs of image correspondences (PQ). 

20. A method as claimed in any preceding claim comprising the step of repeating the
method of claim 1 or claim 2 by digitally processing further 2D images of the object
acquired from different further viewpoints (CMA, CMA') to form a further 3D
representation, the first-mentioned 3D representation and the further 3D
representation being combined by manipulations in a simulated 3D space involving
one or more of rotation and translation, any remaining discrepancies between the 3D
representations optionally being reduced or eliminated by scaling one 3D
representation relative to the other along at least one axis.

21. A method as claimed in claim 20 wherein at least two of the 3D representations
(R1, R2) are simultaneously displayed on screen (5) and their manipulations are
performed in response to commands entered by a user.

22. A method as claimed in claim 21 wherein the manipulations of the 3D 
representations (R1, R2) are performed under the control of a computer pointing
device (6) operated by the user. 

23. A method as claimed in any of claims 20 to 22 wherein a distortion parameter is
entered by the user and applied to said first-mentioned and/or said further 3D
representation.

24. A method as claimed in claim 23 wherein an initial 3D representation (30',
Figure 9) in simulated 3D space is generated by intersecting projections from spaced
apart perspective centres, and the initial 3D representation is rotated whilst
constraining the points of the initial 3D representation which are generated from the
intersecting projections to lie on the projections of those features from their
respective perspective centres, thereby forming a further 3D representation (30C,
Figure 9).

25. A method as claimed in any of claims 20 to 24 wherein a further parameter 
indicative of curvature of field is entered by the user and used to adjust the curvature 
of field of said first-mentioned and/or said further 3D representation. 

26. Image processing apparatus for deriving a 3D representation of at least part of an
object from correlated overlapping 2D images of the object acquired from different
spaced apart viewpoints relative to the object, the apparatus comprising image
processing means (4) which is arranged to digitally process the 2D images to form a
3D representation (30) which extends in a simulated 3D space in dependence upon
both the mutual offset between correspondences of the respective 2D images and a
scaling variable, the scaling variable being representative of the separation between
the viewpoints (CM, CM') at which the 2D images were acquired.

27. Image processing apparatus for deriving a 3D representation of at least part of an 
object from a 2D image of the illuminated object, the object being illuminated with 
structured optical radiation projected from a location (Op) spaced apart from the 
viewpoint (Oc) at which the 2D image is acquired, the 2D image being correlated 
with the structured radiation, the apparatus comprising digital processing means (4) 
arranged to form a 3D representation which extends in a simulated 3D space in 
dependence upon both the correlation and a scaling variable, the scaling variable 
being representative of the separation between the location from which the
structured optical radiation is projected and the viewpoint at which the 2D image is
acquired.

28. Apparatus as claimed in claim 26 or claim 27 comprising display means (5)
arranged to display a view of the representation in simulated 3D space, the size of
the displayed representation being dependent upon the value of the scaling variable.

29. Image processing apparatus as claimed in claim 28, further comprising a camera
(CM) whose position and/or orientation are not fully constrained with respect to the
frame of the object, the camera being arranged to acquire said 2D images.

30. Apparatus according to claim 29 comprising inertial sensor means (G) arranged
to determine the orientation of the camera relative to the object at the time of
acquisition of said 2D images.

31. Apparatus as claimed in any of claims 26 to 30 which is arranged to generate the
3D representation from the intersection of respective projections from spaced apart
perspective centres (pr1, pr2/pr2'), the perspective centres being derived from the
mutual offset between a first pair of correspondences between respective 2D images
and from a further mutual offset between a second pair of correspondences between
respective 2D images, to derive further pairs of correspondences from a search
constrained by the above perspective centre determination and to derive the 3D
representation of the object from the further pairs of correspondences and the
projections.

32. Apparatus as claimed in any of claims 26 to 31 which is arranged to derive a
further 3D representation from further intersections of further projections from
further perspective centres (CMA, CMA'), the apparatus including combining
means (4) arranged to combine the first-mentioned 3D representation and the further
3D representation by manipulations in a simulated 3D space involving one or more
of rotation and translation, the apparatus further comprising scaling means (BN, W1,
W2) arranged to reduce or eliminate any remaining discrepancies between the 3D
representations by scaling one 3D representation relative to the other along at least
one axis.

33. Apparatus as claimed in claim 32 which is arranged to display both 3D
representations (R1, R2) simultaneously and to manipulate them in simulated 3D
space in response to commands entered by a user.

34. Apparatus as claimed in claim 32 or claim 33 which is arranged to correct
distortion of said first-mentioned and/or said further 3D representation resulting
from an incorrect or incomplete calculation of a said perspective centre.

35. Apparatus as claimed in any of claims 32 to 34 which is arranged to correct the
curvature of field of said first-mentioned and/or said further 3D representation (30',
Figure 9) resulting from an incorrect or incomplete calculation of a said perspective
centre (pr2' - Figure 9).

36. A method of determining the motion of a camera (CM) relative to an object (3) 
in the field of view of the camera comprising the steps of projecting the paths (L1,
L2, L3 - Figure 4) of features of the image of the object to a common vanishing
point (VP) and determining the vector (V) between the perspective centre of the 
camera (O) and this vanishing point. 

37. Apparatus for determining the motion of a camera relative to an object in the 
field of view of the camera comprising means (4) for projecting the paths (L1, L2,
L3 - Figure 4) of features of the image of the object to a common vanishing point
and means for determining the vector (V) between the perspective centre (O) of the 
camera and this vanishing point. 

38. A method of generating a 3D reconstruction of an object (3) comprising the steps 
of projecting images of the object acquired by mutually aligned cameras (CM, CM')
into simulated 3D space from aligned virtual projectors (pr1, pr2), the separation of
the virtual projectors being variable by the user. 

39. Apparatus for generating a 3D reconstruction of an object (3) comprising two 
aligned virtual projector means (pr1, pr2) arranged to project images of the object
acquired by mutually aligned cameras into simulated 3D space, the separation of the 
virtual projectors being variable by the user. 

40. Apparatus for generating a 3D representation of at least part of an object (3) 
from an object image (I1) of the projection of structured optical radiation onto the
object surface and from at least one calibration image (I2) of the projection of the
structured optical radiation onto a surface displaced from the object surface, the
apparatus comprising image processing means (4) arranged to generate
correspondences (PQ) between at least one calibration image and the object image
and optionally a further calibration image, and reconstruction processing means
arranged to simulate a first projection of the object image and a second projection
linking respective correspondences of at least two of the correlated images and to
derive said 3D representation from the mutual intersections of the first and second
projections.

41. Apparatus as claimed in claim 40 wherein the first and second projections are 
from a baseline (V) linking an origin of the structured optical radiation (Op) and a 
perspective centre (Oc) associated with the images (I1, I2), the reconstruction
processing means (4) being arranged to derive said baseline from the correlation
(PQ). 

42. Apparatus as claimed in claim 40 or claim 41 wherein the image processing 
means (4) is arranged to correlate two or more calibration images and to determine
the spacing between origins of the first and second projections (Oc, Op) in
dependence upon both the correlation of the two or more calibration images and 
input or stored metric information associated with the calibration images.

43. Apparatus as claimed in any of claims 40 to 42 further comprising projector
means (PR) arranged to project the structured optical radiation onto the object
surface and at least one calibration surface (T1, T2).

44. Apparatus as claimed in any of claims 40 to 42 further comprising a camera 
(CM) arranged to acquire the object image (I1) and at least one calibration image
(I2).

45. Apparatus as claimed in any of claims 40 to 42 further comprising at least one 
calibration target (T1, T2) which in use is illuminated by the structured radiation.

46. Apparatus as claimed in any of claims 40 to 42 wherein the image processing 
means (4) is arranged to correlate pixels in one of said images (I1) with
corresponding locations in the other of said images (I2) by comparing the local
radiometric distributions associated with said pixels and locations respectively.

47. Apparatus as claimed in claim 46 wherein the image processing means (4) is 
arranged to allow a radiometric and/or geometric distortion during the correlation 

48. A method of generating a 3D representation of an object (3) from an object
image (I1) of the projection of structured optical radiation onto the object surface
and from at least one calibration image (I2) of the projection of the structured optical
radiation onto a surface displaced from the object surface, the method comprising the 
steps of: 

i) correlating at least one calibration image with the object image and optionally with 
a further calibration image; 

ii) simulating a first projection of the object image and a second projection of the 
structured optical radiation, and 

iii) deriving said 3D representation from the mutual intersections of the first and 
second projections. 
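
As a hedged illustration of step iii), the intersection of one ray of the first projection with one ray of the second projection might be computed as sketched below. Simulated rays rarely meet exactly, so this sketch takes the midpoint of their closest approach as the reconstructed point; that choice, the function name `intersect_rays`, and the example coordinates for Oc and Op are assumptions made for this example rather than details stated in the claims.

```python
def intersect_rays(o1, d1, o2, d2):
    """Midpoint of closest approach of two 3D rays o1 + s*d1 and o2 + t*d2."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    w = tuple(a - b for a, b in zip(o1, o2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel: no unique intersection")
    # Parameters of the closest points on each ray.
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = tuple(o + s * k for o, k in zip(o1, d1))
    p2 = tuple(o + t * k for o, k in zip(o2, d2))
    return tuple((u + v) / 2.0 for u, v in zip(p1, p2))

# Assumed geometry: perspective centre Oc at the origin, radiation origin Op
# one unit along the x axis, both rays aimed at the same surface point.
oc = (0.0, 0.0, 0.0)
op = (1.0, 0.0, 0.0)
surface_point = (0.5, 0.2, 2.0)
ray_c = tuple(p - o for p, o in zip(surface_point, oc))
ray_p = tuple(p - o for p, o in zip(surface_point, op))
point = intersect_rays(oc, ray_c, op, ray_p)
```

Repeating this for every pair of corresponding rays would yield a point cloud, which is one possible form of the 3D representation.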

49. A method as claimed in claim 48 wherein the first and second projections are 
from a baseline (V) linking an origin of the structured optical radiation (Op) and a 
perspective centre (Oc) associated with the image respectively, said baseline being 
derived from two or more pairs of image correspondences (PQ). 

50. A method as claimed in claim 48 or claim 49 wherein two or more calibration 
images are correlated and the spacing between origins of the first and second 
projections (Oc, Op) is determined in dependence upon both the correlation of the
two or more calibration images and input or stored metric information associated 
with the calibration images. 

51. A method as claimed in any of claims 48 to 50 wherein regions (R) of said 
images (I1, I2) are correlated by comparing the local radiometric and/or colorimetric
distributions associated with said regions. 

52. A method as claimed in claim 51 wherein a radiometric and/or geometric 
distortion is allowed between potentially corresponding regions (R). 
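
One common way of comparing local radiometric distributions while allowing a simple radiometric distortion is zero-mean normalised cross-correlation, sketched below. It is offered only as an example of the kind of measure claims 51 and 52 could cover; the application does not state that this particular measure is used, and the function name `ncc` and the sample intensity values are hypothetical. The score is unchanged by a linear gain and offset in brightness between the two regions; geometric distortion is not handled here.

```python
import math

def ncc(region_a, region_b):
    """Zero-mean normalised cross-correlation of two equal-sized regions.

    Regions are flat sequences of intensities. Returns a score in [-1, 1];
    0.0 is returned when either region has no intensity variation.
    """
    if len(region_a) != len(region_b) or not region_a:
        raise ValueError("regions must be non-empty and the same size")
    mean_a = sum(region_a) / len(region_a)
    mean_b = sum(region_b) / len(region_b)
    da = [v - mean_a for v in region_a]
    db = [v - mean_b for v in region_b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return num / den if den > 0.0 else 0.0

# Assumed sample intensities: the second region is the first with a gain of 2
# and an offset of 5, the third is the first reversed.
region = [10.0, 20.0, 30.0, 40.0]
brighter = [2.0 * v + 5.0 for v in region]
reversed_region = list(reversed(region))
score_same = ncc(region, brighter)
score_opposite = ncc(region, reversed_region)
```

A candidate correspondence could then be accepted when its score exceeds a chosen threshold, the threshold itself being a further assumption.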